# Machine Learning Improvements
This document describes the ML enhancements added to the intelligent autopilot system.
## Overview
The ML improvements focus on making the strategy selection model more robust, interpretable, and adaptive to changing market conditions.
## Components

### 1. Online Learning Pipeline

**Location:** `src/autopilot/online_learning.py`

**Features:**
- Incremental model updates from live trading data
- Concept drift detection using performance windows
- Buffered training samples for efficient batch updates
- Automatic full retraining on drift detection
**Usage:**

```python
from src.autopilot.online_learning import get_online_learning_pipeline

pipeline = get_online_learning_pipeline(model)

# Add training sample after trade
await pipeline.add_training_sample(
    market_conditions=conditions,
    strategy_name="selected_strategy",
    performance=trade_return
)

# Check for drift and retrain if needed
retrain_result = await pipeline.trigger_full_retrain_if_needed()
```
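The drift check itself is not exposed in the snippet above. As a rough illustration of the windowed approach described in the feature list (compare recent performance against an older baseline and flag drift when the gap exceeds a threshold), here is a minimal sketch; `DriftDetector` and its method names are hypothetical, not the module's API.

```python
from collections import deque


class DriftDetector:
    """Illustrative rolling-window drift check (not the pipeline's actual class)."""

    def __init__(self, drift_window: int = 100, drift_threshold: float = 0.1):
        self.recent = deque(maxlen=drift_window)    # latest trade returns
        self.baseline = deque(maxlen=drift_window)  # returns pushed out of the recent window
        self.drift_threshold = drift_threshold

    def add_performance(self, trade_return: float) -> None:
        # Once the recent window is full, its oldest value rolls into the baseline
        if len(self.recent) == self.recent.maxlen:
            self.baseline.append(self.recent[0])
        self.recent.append(trade_return)

    def drift_detected(self) -> bool:
        # Flag drift when mean performance degrades by more than the threshold
        if not self.baseline or not self.recent:
            return False
        baseline_mean = sum(self.baseline) / len(self.baseline)
        recent_mean = sum(self.recent) / len(self.recent)
        return (baseline_mean - recent_mean) > self.drift_threshold
```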
### 2. Confidence Calibration

**Location:** `src/autopilot/confidence_calibration.py`

**Features:**
- Platt scaling (logistic regression calibration)
- Isotonic regression calibration
- Probability distribution calibration
- Validation data integration
**Methods:**
- Platt Scaling: Fast, parametric calibration using logistic regression
- Isotonic Regression: Non-parametric, more flexible but requires more data
**Usage:**

```python
from src.autopilot.confidence_calibration import get_confidence_calibration_manager

calibrator = get_confidence_calibration_manager()

# Fit from validation data
calibrator.fit_from_validation_data(
    predicted_probs=[...],
    true_labels=[...]
)

# Calibrate predictions
strategy, calibrated_conf, calibrated_preds = calibrator.calibrate_prediction(
    strategy_name="strategy",
    confidence=0.85,
    all_predictions={...}
)
```
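For background on the two methods listed above, both can be reproduced with scikit-learn on a handful of validation points. This is a standalone sketch, not the module's implementation (which, per the dependencies section, uses scipy for isotonic regression).

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression

# Raw model confidences and whether the chosen strategy actually paid off
predicted_probs = np.array([0.55, 0.60, 0.70, 0.80, 0.85, 0.90, 0.95])
true_labels = np.array([0, 1, 0, 1, 1, 1, 1])

# Platt scaling: fit a logistic regression on the raw scores
platt = LogisticRegression()
platt.fit(predicted_probs.reshape(-1, 1), true_labels)
platt_conf = platt.predict_proba([[0.85]])[0, 1]

# Isotonic regression: monotonic, non-parametric mapping from raw to calibrated
iso = IsotonicRegression(out_of_bounds="clip")
iso.fit(predicted_probs, true_labels)
iso_conf = iso.predict([0.85])[0]

print(f"Platt-calibrated: {platt_conf:.3f}, isotonic-calibrated: {iso_conf:.3f}")
```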
### 3. Model Explainability

**Location:** `src/autopilot/explainability.py`

**Features:**
- SHAP (SHapley Additive exPlanations) value integration
- Feature importance analysis (global and local)
- Prediction explanations with top contributing features
- Support for tree-based and kernel-based models
**Usage:**

```python
from src.autopilot.explainability import get_model_explainer

explainer = get_model_explainer(model)

# Initialize with background data
explainer.initialize_explainer(background_data_df)

# Explain a prediction
explanation = explainer.explain_prediction(features)
# Returns: feature_importance, top_positive_features, top_negative_features, etc.

# Get global feature importance
global_importance = explainer.get_global_feature_importance()
```
### 4. Advanced Regime Detection

**Location:** `src/autopilot/regime_detection.py`

**Features:**
- Hidden Markov Models (HMM) for regime detection
- Gaussian Mixture Models (GMM) for regime detection
- Hybrid detection combining multiple methods
- Probabilistic regime predictions
**Methods:**
- HMM: Models regime transitions as a Markov process
- GMM: Clusters market states using Gaussian mixtures
- Hybrid: Combines both methods for robust detection
**Usage:**

```python
from src.autopilot.regime_detection import AdvancedRegimeDetector

detector = AdvancedRegimeDetector(method="hmm")
detector.fit_from_dataframe(ohlcv_df)
regime = detector.detect_regime(returns=0.01, volatility=0.02)
```
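To make the GMM option concrete, the sketch below clusters (return, volatility) observations with scikit-learn's `GaussianMixture` and reads regimes off the mixture components. The feature choice, window size, and function names are assumptions, not what `fit_from_dataframe` actually does.

```python
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture


def fit_gmm_regimes(ohlcv_df: pd.DataFrame, n_regimes: int = 4) -> GaussianMixture:
    """Cluster (return, volatility) observations into n_regimes market regimes."""
    returns = ohlcv_df["close"].pct_change()
    volatility = returns.rolling(20).std()
    features = pd.concat([returns, volatility], axis=1).dropna().to_numpy()
    gmm = GaussianMixture(n_components=n_regimes, covariance_type="full", random_state=0)
    gmm.fit(features)
    return gmm


def detect_regime(gmm: GaussianMixture, returns: float, volatility: float) -> tuple[int, np.ndarray]:
    """Return the most likely regime index and the full probability vector."""
    probs = gmm.predict_proba(np.array([[returns, volatility]]))[0]
    return int(np.argmax(probs)), probs
```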
### 5. Enhanced Feature Engineering

**Location:** `src/autopilot/feature_engineering.py`

**Enhancements:**
- Multi-timeframe feature aggregation
- Order book feature extraction
- Feature interactions (products, ratios)
- Regime-specific feature engineering
- Lag features for temporal patterns
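As a simplified illustration of the lag and interaction features in this list, consider the pandas sketch below; column names and window sizes are illustrative, not the module's exact output.

```python
import pandas as pd


def add_basic_features(ohlcv_df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative lag and interaction features for an OHLCV DataFrame."""
    df = ohlcv_df.copy()
    df["return_1"] = df["close"].pct_change()
    df["volatility_20"] = df["return_1"].rolling(20).std()

    # Lag features for temporal patterns
    for lag in (1, 2, 5):
        df[f"return_lag_{lag}"] = df["return_1"].shift(lag)

    # Interaction features: products and ratios of base features
    df["return_x_vol"] = df["return_1"] * df["volatility_20"]
    df["range_ratio"] = (df["high"] - df["low"]) / df["close"]

    return df.dropna()
```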
## Integration

These components integrate with the existing `IntelligentAutopilot` and `StrategySelector` classes:

- Online Learning: Integrated via the `_record_trade_for_learning` method
- Confidence Calibration: Applied in the `select_best_strategy` method (sketched below)
- Explainability: Available via API endpoints for UI visualization
- Regime Detection: Used in `MarketAnalyzer` for enhanced regime classification
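As a rough sketch of the calibration hook (the real `select_best_strategy` lives on `StrategySelector` and its signature may differ), the wiring amounts to: score the strategies, then run the winning confidence through the calibrator before acting on it.

```python
from src.autopilot.confidence_calibration import get_confidence_calibration_manager


def select_best_strategy(raw_predictions: dict[str, float]) -> tuple[str, float]:
    """Hypothetical standalone version of the calibration step in strategy selection."""
    best_strategy = max(raw_predictions, key=raw_predictions.get)
    calibrator = get_confidence_calibration_manager()
    strategy, calibrated_conf, _ = calibrator.calibrate_prediction(
        strategy_name=best_strategy,
        confidence=raw_predictions[best_strategy],
        all_predictions=raw_predictions,
    )
    return strategy, calibrated_conf
```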
## Configuration

Configuration options in `config/config.yaml`:
```yaml
autopilot:
  intelligent:
    online_learning:
      drift_window: 100
      drift_threshold: 0.1
      buffer_size: 50
      update_frequency: 100
    confidence_calibration:
      method: "isotonic"  # or "platt"
    regime_detection:
      method: "hmm"  # or "gmm" or "hybrid"
      n_regimes: 4
```
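A minimal sketch of reading these options (assuming PyYAML) and constructing the regime detector from section 4; the application's actual wiring may differ.

```python
import yaml

from src.autopilot.regime_detection import AdvancedRegimeDetector

with open("config/config.yaml") as f:
    config = yaml.safe_load(f)

regime_cfg = config["autopilot"]["intelligent"]["regime_detection"]
# Only the `method` keyword is documented above; n_regimes appears in the config,
# but its constructor parameter is not shown, so it is omitted here.
detector = AdvancedRegimeDetector(method=regime_cfg["method"])
```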
## Dependencies

Optional dependencies (with fallbacks; see the import-guard sketch below):

- `hmmlearn`: For HMM regime detection
- `shap`: For model explainability
- `scipy`: For calibration methods (isotonic regression)
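The usual fallback pattern for these optional packages is a guarded import along these lines (illustrative; the modules' actual guards may differ):

```python
# Degrade gracefully when hmmlearn is not installed
try:
    from hmmlearn import hmm
    HMM_AVAILABLE = True
except ImportError:
    hmm = None
    HMM_AVAILABLE = False

# Callers check the flag before using HMM-based regime detection
if not HMM_AVAILABLE:
    print("hmmlearn not installed; HMM regime detection unavailable")
```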
## Performance Considerations
- Online Learning: Batches updates for efficiency (configurable buffer size)
- SHAP Values: Can be slow for large models; consider caching or background computation (see the caching sketch after this list)
- HMM/GMM: Training is fast, prediction is very fast
- Calibration: Fitting is fast, prediction is O(1)
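One way to keep SHAP off the hot path, as suggested above, is to cache explanations keyed by the feature vector. The wrapper below is a hypothetical helper around the documented `explain_prediction` call, not part of `explainability.py`.

```python
class CachedExplainer:
    """Caches explanations so repeated feature vectors skip the SHAP computation."""

    def __init__(self, explainer, maxsize: int = 1024):
        self._explainer = explainer
        self._cache: dict[tuple, object] = {}
        self._maxsize = maxsize

    def explain(self, features: list[float]):
        key = tuple(features)
        if key not in self._cache:
            if len(self._cache) >= self._maxsize:
                # Evict the oldest entry (dicts preserve insertion order)
                self._cache.pop(next(iter(self._cache)))
            self._cache[key] = self._explainer.explain_prediction(features)
        return self._cache[key]
```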
## Testing
Recommended testing approach:
- Use synthetic data for online learning pipeline
- Test calibration with known probability distributions (see the sketch after this list)
- Validate SHAP values against known feature importance
- Compare HMM/GMM regimes against rule-based classification
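For the calibration item, a pytest-style test with a known distribution might look like the following; the expectation that calibration pulls an overconfident 0.9 toward the empirical ~0.6 hit rate, and the 0.8 threshold, are assumptions about the calibrator's behavior.

```python
import numpy as np

from src.autopilot.confidence_calibration import get_confidence_calibration_manager


def test_calibration_tempers_overconfident_model():
    # Synthetic validation set: confidences near 0.9 but only ~60% of trades win
    rng = np.random.default_rng(42)
    predicted_probs = rng.uniform(0.7, 0.95, size=500)
    true_labels = (rng.random(500) < 0.6).astype(int)

    calibrator = get_confidence_calibration_manager()
    calibrator.fit_from_validation_data(
        predicted_probs=list(predicted_probs),
        true_labels=list(true_labels),
    )

    _, calibrated_conf, _ = calibrator.calibrate_prediction(
        strategy_name="test_strategy",
        confidence=0.9,
        all_predictions={"test_strategy": 0.9},
    )

    # Calibrated confidence should move toward the empirical hit rate
    assert calibrated_conf < 0.8
```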