Finance Quant Agent
🥇 The First Data-Centric Quant Multi-Agent Framework: RD-Agent(Q)
R&D-Agent for Quantitative Finance, or RD-Agent(Q) for short, is the first data-centric, multi-agent framework designed to automate the full-stack research and development of quantitative strategies through coordinated factor-model co-optimization.
You can learn more details about RD-Agent(Q) through the paper.
⚡ Quick Start
Before you start, make sure RD-Agent is installed and its environment is configured correctly. For installation and configuration instructions, please refer to the documentation.
Then set up and run the framework with the following steps:
🐍 Create a Conda Environment
Create a new conda environment with Python (3.10 and 3.11 are well tested in our CI):
conda create -n rdagent python=3.10
Activate the environment:
conda activate rdagent
📦 Install RD-Agent
Install the rdagent package from PyPI:
pip install rdagent
🚀 Run the Application
Run the application with the following command:
rdagent fin_quant
🛠️ Usage of modules
Env Config
The following environment variables can be set in the .env file to customize the application’s behavior:
- pydantic settings rdagent.app.qlib_rd_loop.conf.QuantBasePropSetting
JSON schema:
{ "title": "QuantBasePropSetting", "type": "object", "properties": { "scen": { "default": "rdagent.scenarios.qlib.experiment.quant_experiment.QlibQuantScenario", "title": "Scen", "type": "string" }, "knowledge_base": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "title": "Knowledge Base" }, "knowledge_base_path": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "title": "Knowledge Base Path" }, "hypothesis_gen": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "title": "Hypothesis Gen" }, "interactor": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "title": "Interactor" }, "hypothesis2experiment": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "title": "Hypothesis2Experiment" }, "coder": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "title": "Coder" }, "runner": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "title": "Runner" }, "summarizer": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "title": "Summarizer" }, "evolving_n": { "default": 10, "title": "Evolving N", "type": "integer" }, "quant_hypothesis_gen": { "default": "rdagent.scenarios.qlib.proposal.quant_proposal.QlibQuantHypothesisGen", "title": "Quant Hypothesis Gen", "type": "string" }, "model_hypothesis2experiment": { "default": "rdagent.scenarios.qlib.proposal.model_proposal.QlibModelHypothesis2Experiment", "title": "Model Hypothesis2Experiment", "type": "string" }, "model_coder": { "default": "rdagent.scenarios.qlib.developer.model_coder.QlibModelCoSTEER", "title": "Model Coder", "type": "string" }, "model_runner": { "default": "rdagent.scenarios.qlib.developer.model_runner.QlibModelRunner", "title": "Model Runner", "type": "string" }, "model_summarizer": { "default": "rdagent.scenarios.qlib.developer.feedback.QlibModelExperiment2Feedback", "title": "Model Summarizer", "type": "string" }, 
"factor_hypothesis2experiment": { "default": "rdagent.scenarios.qlib.proposal.factor_proposal.QlibFactorHypothesis2Experiment", "title": "Factor Hypothesis2Experiment", "type": "string" }, "factor_coder": { "default": "rdagent.scenarios.qlib.developer.factor_coder.QlibFactorCoSTEER", "title": "Factor Coder", "type": "string" }, "factor_runner": { "default": "rdagent.scenarios.qlib.developer.factor_runner.QlibFactorRunner", "title": "Factor Runner", "type": "string" }, "factor_summarizer": { "default": "rdagent.scenarios.qlib.developer.feedback.QlibFactorExperiment2Feedback", "title": "Factor Summarizer", "type": "string" }, "action_selection": { "default": "bandit", "title": "Action Selection", "type": "string" }, "train_start": { "default": "2008-01-01", "title": "Train Start", "type": "string" }, "train_end": { "default": "2014-12-31", "title": "Train End", "type": "string" }, "valid_start": { "default": "2015-01-01", "title": "Valid Start", "type": "string" }, "valid_end": { "default": "2016-12-31", "title": "Valid End", "type": "string" }, "test_start": { "default": "2017-01-01", "title": "Test Start", "type": "string" }, "test_end": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "2020-08-01", "title": "Test End" } }, "additionalProperties": false }
- Config:
env_prefix: str = QLIB_QUANT_
protected_namespaces: tuple = ()
- field action_selection: str = 'bandit'
Action selection strategy: 'bandit' for bandit-based selection, 'llm' for LLM-based selection, 'random' for random selection
- field evolving_n: int = 10
Number of evolutions
- field factor_coder: str = 'rdagent.scenarios.qlib.developer.factor_coder.QlibFactorCoSTEER'
Coder class
- field factor_hypothesis2experiment: str = 'rdagent.scenarios.qlib.proposal.factor_proposal.QlibFactorHypothesis2Experiment'
Hypothesis to experiment class
- field factor_runner: str = 'rdagent.scenarios.qlib.developer.factor_runner.QlibFactorRunner'
Runner class
- field factor_summarizer: str = 'rdagent.scenarios.qlib.developer.feedback.QlibFactorExperiment2Feedback'
Summarizer class
- field model_coder: str = 'rdagent.scenarios.qlib.developer.model_coder.QlibModelCoSTEER'
Coder class
- field model_hypothesis2experiment: str = 'rdagent.scenarios.qlib.proposal.model_proposal.QlibModelHypothesis2Experiment'
Hypothesis to experiment class
- field model_runner: str = 'rdagent.scenarios.qlib.developer.model_runner.QlibModelRunner'
Runner class
- field model_summarizer: str = 'rdagent.scenarios.qlib.developer.feedback.QlibModelExperiment2Feedback'
Summarizer class
- field quant_hypothesis_gen: str = 'rdagent.scenarios.qlib.proposal.quant_proposal.QlibQuantHypothesisGen'
Hypothesis generation class
- field scen: str = 'rdagent.scenarios.qlib.experiment.quant_experiment.QlibQuantScenario'
Scenario class for Qlib Model
- field test_end: str | None = '2020-08-01'
End date of the test / backtest segment
- field test_start: str = '2017-01-01'
Start date of the test / backtest segment
- field train_end: str = '2014-12-31'
End date of the training segment
- field train_start: str = '2008-01-01'
Start date of the training segment
- field valid_end: str = '2016-12-31'
End date of the validation segment
- field valid_start: str = '2015-01-01'
Start date of the validation segment
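As an illustration of how the QLIB_QUANT_ prefix combines with the field names above, the following sketch builds .env lines for a custom train/valid/test split and checks that the dates are in chronological order before writing them. The helper names (to_env_lines, check_segments) and the example dates are hypothetical, not part of RD-Agent, and the upper-case variable names assume the settings class keeps the default case-insensitive matching of pydantic-settings.

```python
from datetime import date

ENV_PREFIX = "QLIB_QUANT_"  # env_prefix from the Config entry above

SEGMENT_ORDER = [
    "train_start", "train_end",
    "valid_start", "valid_end",
    "test_start", "test_end",
]


def to_env_lines(overrides: dict[str, str], prefix: str = ENV_PREFIX) -> list[str]:
    """Turn {field_name: value} into NAME=value lines for a .env file."""
    return [f"{prefix}{name.upper()}={value}" for name, value in overrides.items()]


def check_segments(cfg: dict[str, str]) -> None:
    """Raise ValueError unless the six segment dates are strictly increasing."""
    days = [date.fromisoformat(cfg[key]) for key in SEGMENT_ORDER]
    for earlier, later, a, b in zip(SEGMENT_ORDER, SEGMENT_ORDER[1:], days, days[1:]):
        if not a < b:
            raise ValueError(f"{earlier} ({a}) must be before {later} ({b})")


# Hypothetical override: move every window two years later than the defaults.
overrides = {
    "train_start": "2010-01-01",
    "train_end": "2016-12-31",
    "valid_start": "2017-01-01",
    "valid_end": "2018-12-31",
    "test_start": "2019-01-01",
    "test_end": "2020-08-01",
}
check_segments(overrides)
env_lines = to_env_lines(overrides)
print("\n".join(env_lines))
```

The printed lines, such as QLIB_QUANT_TRAIN_START=2010-01-01, can be pasted into the .env file. The ordering check is only a convenience of this sketch; the source does not state whether RD-Agent validates the segments itself.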
- pydantic settings rdagent.components.coder.factor_coder.config.FactorCoSTEERSettings
JSON schema:
{ "title": "FactorCoSTEERSettings", "type": "object", "properties": { "coder_use_cache": { "default": false, "title": "Coder Use Cache", "type": "boolean" }, "max_loop": { "default": 10, "title": "Max Loop", "type": "integer" }, "fail_task_trial_limit": { "default": 20, "title": "Fail Task Trial Limit", "type": "integer" }, "v1_query_former_trace_limit": { "default": 3, "title": "V1 Query Former Trace Limit", "type": "integer" }, "v1_query_similar_success_limit": { "default": 3, "title": "V1 Query Similar Success Limit", "type": "integer" }, "v2_query_component_limit": { "default": 1, "title": "V2 Query Component Limit", "type": "integer" }, "v2_query_error_limit": { "default": 1, "title": "V2 Query Error Limit", "type": "integer" }, "v2_query_former_trace_limit": { "default": 3, "title": "V2 Query Former Trace Limit", "type": "integer" }, "v2_add_fail_attempt_to_latest_successful_execution": { "default": false, "title": "V2 Add Fail Attempt To Latest Successful Execution", "type": "boolean" }, "v2_error_summary": { "default": false, "title": "V2 Error Summary", "type": "boolean" }, "v2_knowledge_sampler": { "default": 1.0, "title": "V2 Knowledge Sampler", "type": "number" }, "knowledge_base_path": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "title": "Knowledge Base Path" }, "new_knowledge_base_path": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "title": "New Knowledge Base Path" }, "enable_filelock": { "default": false, "title": "Enable Filelock", "type": "boolean" }, "filelock_path": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "title": "Filelock Path" }, "max_seconds_multiplier": { "default": 1000000, "title": "Max Seconds Multiplier", "type": "integer" }, "data_folder": { "default": "git_ignore_folder/factor_implementation_source_data", "title": "Data Folder", "type": "string" }, "data_folder_debug": { "default": "git_ignore_folder/factor_implementation_source_data_debug", 
"title": "Data Folder Debug", "type": "string" }, "simple_background": { "default": false, "title": "Simple Background", "type": "boolean" }, "file_based_execution_timeout": { "default": 3600, "title": "File Based Execution Timeout", "type": "integer" }, "select_method": { "default": "random", "title": "Select Method", "type": "string" }, "python_bin": { "default": "python", "title": "Python Bin", "type": "string" } }, "additionalProperties": false }
- Config:
env_prefix: str = FACTOR_CoSTEER_
- field coder_use_cache: bool = False
Indicates whether to use cache for the coder
- field data_folder: str = 'git_ignore_folder/factor_implementation_source_data'
Path to the folder containing financial data (default is fundamental data in Qlib)
- field data_folder_debug: str = 'git_ignore_folder/factor_implementation_source_data_debug'
Path to the folder containing partial financial data (for debugging)
- field enable_filelock: bool = False
- field file_based_execution_timeout: int = 3600
Timeout in seconds for each factor implementation execution
- field filelock_path: str | None = None
- field knowledge_base_path: str | None = None
Path to the knowledge base
- field max_loop: int = 10
Maximum number of task implementation loops
- field max_seconds_multiplier: int = 1000000
- field new_knowledge_base_path: str | None = None
Path to the new knowledge base
- field python_bin: str = 'python'
Path to the Python binary
- field select_method: str = 'random'
Method for selecting factor implementations
- field simple_background: bool = False
Whether to use simple background information for code feedback
- field v2_add_fail_attempt_to_latest_successful_execution: bool = False
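Because the FACTOR_CoSTEER_ prefix is mixed-case, it helps to see how an environment variable might be matched to a field and converted to the field's type. The sketch below is a simplified stand-in for that lookup, not RD-Agent's or pydantic-settings' actual code: it assumes names are matched case-insensitively (the pydantic-settings default) and uses a reduced set of accepted boolean strings. The functions coerce and resolve are hypothetical, and only a few fields from the list above are included.

```python
ENV_PREFIX = "FACTOR_CoSTEER_"  # env_prefix from the Config entry above

# A few fields and their defaults, copied from the field list above.
DEFAULTS = {
    "max_loop": 10,
    "file_based_execution_timeout": 3600,
    "coder_use_cache": False,
    "select_method": "random",
}


def coerce(raw: str, default):
    """Convert an environment string to the type of the field's default."""
    if isinstance(default, bool):  # test bool first: bool is a subclass of int
        return raw.strip().lower() in {"1", "true", "yes", "on"}
    if isinstance(default, int):
        return int(raw)
    return raw


def resolve(environ, prefix: str = ENV_PREFIX, defaults=DEFAULTS) -> dict:
    """Overlay the defaults with matching variables, ignoring letter case."""
    lowered = {key.lower(): value for key, value in environ.items()}
    resolved = dict(defaults)
    for name, default in defaults.items():
        key = (prefix + name).lower()
        if key in lowered:
            resolved[name] = coerce(lowered[key], default)
    return resolved


# A stand-in for os.environ so the example does not depend on the real shell.
fake_env = {
    "FACTOR_CoSTEER_MAX_LOOP": "5",
    "factor_costeer_coder_use_cache": "true",
}
settings = resolve(fake_env)
print(settings)
```

Under the case-insensitivity assumption, both spellings of the prefix in fake_env are picked up, while fields without a matching variable keep their defaults.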
- Qlib Configuration
- The .yaml files in the model_template and factor_template directories configure how the corresponding models or factors are run within the Qlib framework. Their contents and roles are summarized below:
- General Settings:
provider_uri: Specifies the local Qlib data path, set to ~/.qlib/qlib_data/cn_data.
market: Configured to csi300, representing the CSI 300 index constituents.
benchmark: Set to SH000300, used for backtesting evaluation.
- Data Handling:
start_time and end_time: Define the full data range, from 2008-01-01 to 2022-08-01.
fit_start_time: The start date for fitting the model, set to 2008-01-01.
fit_end_time: The end date for fitting the model, set to 2014-12-31.
features and labels: Generated via a nested data loader combining Alpha158DL (for engineered features such as RESI5, WVMA5, RSQR5, KLEN, etc.) and a StaticDataLoader that loads precomputed factor files (combined_factors_df.parquet).
normalization: The pipeline includes RobustZScoreNorm (with clipping) and Fillna for inference, and DropnaLabel with CSZScoreNorm for training.
- Training Configuration:
Model: Uses GeneralPTNN, a PyTorch-based neural network model.
- Dataset Splits:
train: 2008-01-01 to 2014-12-31
valid: 2015-01-01 to 2016-12-31
test: 2017-01-01 to 2020-08-01
- Default Hyperparameters (can be overridden by command-line arguments):
n_epochs: 100
lr: 2e-4
early_stop: 10
batch_size: 256
weight_decay: 0.0
metric: loss
loss: mse
n_jobs: 20
GPU: 0 (uses GPU 0 if available)
- Backtesting and Evaluation:
strategy: TopkDropoutStrategy, which holds the 50 highest-scored stocks (topk) and replaces up to 5 of the lowest-ranked holdings (n_drop) at each rebalance to limit turnover.
backtest period: 2017-01-01 to 2020-08-01
initial capital: 100,000,000
cost configuration: Includes open/close costs, minimum transaction costs, and slippage control.
- Recording and Analysis:
SignalRecord: Logs predicted signals.
SigAnaRecord: Performs signal analysis without long-short separation.
PortAnaRecord: Conducts portfolio analysis using the configured strategy and backtest settings.
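To make the strategy entry under Backtesting and Evaluation more concrete, here is a minimal sketch of one rebalancing step of a top-k/dropout rule. It is a simplified illustration written for this page, not Qlib's TopkDropoutStrategy implementation: it assumes the lowest-scored holdings are the ones swapped out, that every current holding has a score, and it ignores tradability, costs, and position sizing. The function name rebalance and the toy tickers are hypothetical.

```python
def rebalance(holdings: list[str], scores: dict[str, float],
              topk: int = 50, n_drop: int = 5) -> list[str]:
    """Return the new holdings after one simplified top-k/dropout step."""
    held_set = set(holdings)
    ranked = sorted(scores, key=scores.get, reverse=True)  # best score first

    held = [s for s in ranked if s in held_set]
    # Buy candidates: best unheld names, enough to refill to topk plus n_drop swaps.
    n_candidates = max(0, n_drop + topk - len(held))
    candidates = [s for s in ranked if s not in held_set][:n_candidates]

    # Rank holdings and candidates together; only holdings that land in the
    # worst n_drop positions of this combined list are sold.
    combined = sorted(held + candidates, key=scores.get, reverse=True)
    worst = set(combined[max(0, len(combined) - n_drop):])
    sell = [s for s in held if s in worst]

    # Buy as many names as were sold, plus any shortfall below topk.
    n_buy = max(0, len(sell) + topk - len(held))
    buy = candidates[:n_buy]
    return [s for s in held if s not in worst] + buy


# Toy example with topk=3 and n_drop=1 instead of the configured 50 and 5.
scores = {"A": 0.9, "B": 0.1, "C": 0.5, "D": 0.8, "E": 0.3}
new_holdings = rebalance(["A", "B", "C"], scores, topk=3, n_drop=1)
print(new_holdings)
```

In the toy run, B has the lowest score among the holdings and is swapped for D, the best-scored stock not yet held. If all holdings already rank above every candidate, nothing is sold, which is how n_drop caps turnover per step.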