Finance Quant Agent

🥇 The First Data-Centric Quant Multi-Agent Framework: RD-Agent(Q)

R&D-Agent for Quantitative Finance, or RD-Agent(Q) for short, is the first data-centric, multi-agent framework designed to automate the full-stack research and development of quantitative strategies through coordinated factor-model co-optimization.

For more details about RD-Agent(Q), see the paper.

⚡ Quick Start

Before you start, make sure RD-Agent is installed and its environment is configured correctly. For installation and configuration instructions, refer to the documentation.

Then set up and run the framework with the following steps:

🐍 Create a Conda Environment
- Create a new conda environment with Python (3.10 and 3.11 are well tested in our CI):
```
conda create -n rdagent python=3.10
```
- Activate the environment:
```
conda activate rdagent
```
📦 Install RD-Agent
- Install the rdagent package from PyPI:
```
pip install rdagent
```
🚀 Run the Application
- Run the application directly with the following command:
```
rdagent fin_quant
```

🛠️ Usage of Modules

Env Config

The following environment variables can be set in the .env file to customize the application’s behavior:

pydantic settings rdagent.app.qlib_rd_loop.conf.QuantBasePropSetting

JSON schema:

{
   "title": "QuantBasePropSetting",
   "type": "object",
   "properties": {
      "scen": {
         "default": "rdagent.scenarios.qlib.experiment.quant_experiment.QlibQuantScenario",
         "title": "Scen",
         "type": "string"
      },
      "knowledge_base": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Knowledge Base"
      },
      "knowledge_base_path": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Knowledge Base Path"
      },
      "hypothesis_gen": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Hypothesis Gen"
      },
      "interactor": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Interactor"
      },
      "hypothesis2experiment": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Hypothesis2Experiment"
      },
      "coder": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Coder"
      },
      "runner": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Runner"
      },
      "summarizer": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Summarizer"
      },
      "evolving_n": {
         "default": 10,
         "title": "Evolving N",
         "type": "integer"
      },
      "quant_hypothesis_gen": {
         "default": "rdagent.scenarios.qlib.proposal.quant_proposal.QlibQuantHypothesisGen",
         "title": "Quant Hypothesis Gen",
         "type": "string"
      },
      "model_hypothesis2experiment": {
         "default": "rdagent.scenarios.qlib.proposal.model_proposal.QlibModelHypothesis2Experiment",
         "title": "Model Hypothesis2Experiment",
         "type": "string"
      },
      "model_coder": {
         "default": "rdagent.scenarios.qlib.developer.model_coder.QlibModelCoSTEER",
         "title": "Model Coder",
         "type": "string"
      },
      "model_runner": {
         "default": "rdagent.scenarios.qlib.developer.model_runner.QlibModelRunner",
         "title": "Model Runner",
         "type": "string"
      },
      "model_summarizer": {
         "default": "rdagent.scenarios.qlib.developer.feedback.QlibModelExperiment2Feedback",
         "title": "Model Summarizer",
         "type": "string"
      },
      "factor_hypothesis2experiment": {
         "default": "rdagent.scenarios.qlib.proposal.factor_proposal.QlibFactorHypothesis2Experiment",
         "title": "Factor Hypothesis2Experiment",
         "type": "string"
      },
      "factor_coder": {
         "default": "rdagent.scenarios.qlib.developer.factor_coder.QlibFactorCoSTEER",
         "title": "Factor Coder",
         "type": "string"
      },
      "factor_runner": {
         "default": "rdagent.scenarios.qlib.developer.factor_runner.QlibFactorRunner",
         "title": "Factor Runner",
         "type": "string"
      },
      "factor_summarizer": {
         "default": "rdagent.scenarios.qlib.developer.feedback.QlibFactorExperiment2Feedback",
         "title": "Factor Summarizer",
         "type": "string"
      },
      "action_selection": {
         "default": "bandit",
         "title": "Action Selection",
         "type": "string"
      },
      "train_start": {
         "default": "2008-01-01",
         "title": "Train Start",
         "type": "string"
      },
      "train_end": {
         "default": "2014-12-31",
         "title": "Train End",
         "type": "string"
      },
      "valid_start": {
         "default": "2015-01-01",
         "title": "Valid Start",
         "type": "string"
      },
      "valid_end": {
         "default": "2016-12-31",
         "title": "Valid End",
         "type": "string"
      },
      "test_start": {
         "default": "2017-01-01",
         "title": "Test Start",
         "type": "string"
      },
      "test_end": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": "2020-08-01",
         "title": "Test End"
      }
   },
   "additionalProperties": false
}

Config:

env_prefix: str = QLIB_QUANT_
protected_namespaces: tuple = ()
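As a rough illustration of how the QLIB_QUANT_ prefix combines with field names, the sketch below renders .env lines from a dictionary of overrides. It assumes the usual pydantic-settings convention (the prefix followed by the field name, matched case-insensitively). `render_env_lines` is a hypothetical helper written for this example and is not part of RD-Agent.

```python
ENV_PREFIX = "QLIB_QUANT_"  # env_prefix from the Config section above


def render_env_lines(overrides: dict[str, object], prefix: str = ENV_PREFIX) -> list[str]:
    """Turn {field_name: value} into KEY=value lines suitable for a .env file.

    Hypothetical helper: it only builds strings and does not validate the values.
    """
    lines = []
    for field, value in overrides.items():
        # Assumed convention: prefix + upper-cased field name.
        key = f"{prefix}{field.upper()}"
        lines.append(f"{key}={value}")
    return lines


if __name__ == "__main__":
    example = {"evolving_n": 5, "action_selection": "llm", "test_end": "2020-08-01"}
    for line in render_env_lines(example):
        print(line)
```

The same pattern would apply to any other settings class by swapping in its own prefix.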

field action_selection: str = 'bandit': Action selection strategy: 'bandit' for bandit-based selection, 'llm' for LLM-based selection, 'random' for random selection

field evolving_n: int = 10: Number of evolutions

field factor_coder: str = 'rdagent.scenarios.qlib.developer.factor_coder.QlibFactorCoSTEER': Coder class

field factor_hypothesis2experiment: str = 'rdagent.scenarios.qlib.proposal.factor_proposal.QlibFactorHypothesis2Experiment': Hypothesis to experiment class

field factor_runner: str = 'rdagent.scenarios.qlib.developer.factor_runner.QlibFactorRunner': Runner class

field factor_summarizer: str = 'rdagent.scenarios.qlib.developer.feedback.QlibFactorExperiment2Feedback': Summarizer class

field model_coder: str = 'rdagent.scenarios.qlib.developer.model_coder.QlibModelCoSTEER': Coder class

field model_hypothesis2experiment: str = 'rdagent.scenarios.qlib.proposal.model_proposal.QlibModelHypothesis2Experiment': Hypothesis to experiment class

field model_runner: str = 'rdagent.scenarios.qlib.developer.model_runner.QlibModelRunner': Runner class

field model_summarizer: str = 'rdagent.scenarios.qlib.developer.feedback.QlibModelExperiment2Feedback': Summarizer class

field quant_hypothesis_gen: str = 'rdagent.scenarios.qlib.proposal.quant_proposal.QlibQuantHypothesisGen': Hypothesis generation class

field scen: str = 'rdagent.scenarios.qlib.experiment.quant_experiment.QlibQuantScenario': Scenario class for Qlib Model

field test_end: str | None = '2020-08-01': End date of the test / backtest segment

field test_start: str = '2017-01-01': Start date of the test / backtest segment

field train_end: str = '2014-12-31': End date of the training segment

field train_start: str = '2008-01-01': Start date of the training segment

field valid_end: str = '2016-12-31': End date of the validation segment

field valid_start: str = '2015-01-01': Start date of the validation segment
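When overriding the segment dates, it can help to confirm that the train, validation, and test ranges remain chronological and non-overlapping before launching a run. The following is a minimal sketch using only the standard library. `check_segments` is a hypothetical helper, and treating a `None` end date as open-ended is an assumption based on the `str | None` type of test_end.

```python
from datetime import date

# Default segment boundaries as listed in the fields above.
SEGMENTS = {
    "train": ("2008-01-01", "2014-12-31"),
    "valid": ("2015-01-01", "2016-12-31"),
    "test": ("2017-01-01", "2020-08-01"),
}


def check_segments(segments):
    """Return True if segments are well-formed and ordered; raise ValueError otherwise."""
    previous_end = None
    for name in ("train", "valid", "test"):
        start_s, end_s = segments[name]
        start = date.fromisoformat(start_s)
        # Assumption: a missing end date means "until the latest available data".
        end = date.max if end_s is None else date.fromisoformat(end_s)
        if start > end:
            raise ValueError(f"{name}: start {start} is after end {end}")
        if previous_end is not None and start <= previous_end:
            raise ValueError(f"{name}: start {start} overlaps the previous segment")
        previous_end = end
    return True


if __name__ == "__main__":
    print(check_segments(SEGMENTS))
```

A check like this catches look-ahead mistakes, such as a validation window that starts before training ends.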

pydantic settings rdagent.components.coder.factor_coder.config.FactorCoSTEERSettings

JSON schema:

{
   "title": "FactorCoSTEERSettings",
   "type": "object",
   "properties": {
      "coder_use_cache": {
         "default": false,
         "title": "Coder Use Cache",
         "type": "boolean"
      },
      "max_loop": {
         "default": 10,
         "title": "Max Loop",
         "type": "integer"
      },
      "fail_task_trial_limit": {
         "default": 20,
         "title": "Fail Task Trial Limit",
         "type": "integer"
      },
      "v1_query_former_trace_limit": {
         "default": 3,
         "title": "V1 Query Former Trace Limit",
         "type": "integer"
      },
      "v1_query_similar_success_limit": {
         "default": 3,
         "title": "V1 Query Similar Success Limit",
         "type": "integer"
      },
      "v2_query_component_limit": {
         "default": 1,
         "title": "V2 Query Component Limit",
         "type": "integer"
      },
      "v2_query_error_limit": {
         "default": 1,
         "title": "V2 Query Error Limit",
         "type": "integer"
      },
      "v2_query_former_trace_limit": {
         "default": 3,
         "title": "V2 Query Former Trace Limit",
         "type": "integer"
      },
      "v2_add_fail_attempt_to_latest_successful_execution": {
         "default": false,
         "title": "V2 Add Fail Attempt To Latest Successful Execution",
         "type": "boolean"
      },
      "v2_error_summary": {
         "default": false,
         "title": "V2 Error Summary",
         "type": "boolean"
      },
      "v2_knowledge_sampler": {
         "default": 1.0,
         "title": "V2 Knowledge Sampler",
         "type": "number"
      },
      "knowledge_base_path": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Knowledge Base Path"
      },
      "new_knowledge_base_path": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "New Knowledge Base Path"
      },
      "enable_filelock": {
         "default": false,
         "title": "Enable Filelock",
         "type": "boolean"
      },
      "filelock_path": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Filelock Path"
      },
      "max_seconds_multiplier": {
         "default": 1000000,
         "title": "Max Seconds Multiplier",
         "type": "integer"
      },
      "data_folder": {
         "default": "git_ignore_folder/factor_implementation_source_data",
         "title": "Data Folder",
         "type": "string"
      },
      "data_folder_debug": {
         "default": "git_ignore_folder/factor_implementation_source_data_debug",
         "title": "Data Folder Debug",
         "type": "string"
      },
      "simple_background": {
         "default": false,
         "title": "Simple Background",
         "type": "boolean"
      },
      "file_based_execution_timeout": {
         "default": 3600,
         "title": "File Based Execution Timeout",
         "type": "integer"
      },
      "select_method": {
         "default": "random",
         "title": "Select Method",
         "type": "string"
      },
      "python_bin": {
         "default": "python",
         "title": "Python Bin",
         "type": "string"
      }
   },
   "additionalProperties": false
}

Config:

env_prefix: str = FACTOR_CoSTEER_

field coder_use_cache: bool = False: Indicates whether to use cache for the coder

field data_folder: str = 'git_ignore_folder/factor_implementation_source_data': Path to the folder containing financial data (default is fundamental data in Qlib)

field data_folder_debug: str = 'git_ignore_folder/factor_implementation_source_data_debug': Path to the folder containing partial financial data (for debugging)

field enable_filelock: bool = False

field file_based_execution_timeout: int = 3600: Timeout in seconds for each factor implementation execution

field filelock_path: str | None = None

field knowledge_base_path: str | None = None: Path to the knowledge base

field max_loop: int = 10: Maximum number of task implementation loops

field max_seconds_multiplier: int = 1000000

field new_knowledge_base_path: str | None = None: Path to the new knowledge base

field python_bin: str = 'python': Path to the Python binary

field select_method: str = 'random': Method for selecting factor implementations

field simple_background: bool = False: Whether to use simple background information for code feedback

field v2_add_fail_attempt_to_latest_successful_execution: bool = False
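To make the roles of python_bin and file_based_execution_timeout more concrete, the sketch below runs a script with a chosen interpreter and gives up after a time limit. How RD-Agent actually launches factor code is not described in this section. `run_with_timeout` is a hypothetical helper built on the standard library, and it defaults to the current interpreter rather than the documented 'python' value for portability.

```python
import subprocess
import sys


def run_with_timeout(script_path, python_bin=sys.executable, timeout=3600):
    """Run script_path with python_bin.

    Returns (returncode, stdout), or (None, "") if the time limit is exceeded.
    The 3600-second default mirrors file_based_execution_timeout above.
    """
    try:
        result = subprocess.run(
            [python_bin, script_path],
            capture_output=True,  # collect stdout/stderr instead of printing them
            text=True,            # decode output as str
            timeout=timeout,      # seconds before the child process is killed
        )
    except subprocess.TimeoutExpired:
        return None, ""
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Usage: python this_file.py path/to/factor_script.py
    if len(sys.argv) > 1:
        print(run_with_timeout(sys.argv[1], timeout=60))
```

Returning `None` on timeout keeps the caller's control flow simple. A real pipeline would likely also want to record stderr.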

Qlib Configuration
- The .yaml files in the model_template and factor_template directories contain the configuration used to run the corresponding models or factors within the Qlib framework. Below is an overview of their contents and roles:
  
  General Settings:
  
  provider_uri: Specifies the local Qlib data path, set to ~/.qlib/qlib_data/cn_data.
  
  market: Configured to csi300, representing the CSI 300 index constituents.
  
  benchmark: Set to SH000300, used for backtesting evaluation.
  
  Data Handling:
  
  start_time and end_time: Define the full data range, from 2008-01-01 to 2022-08-01.
  
  fit_start_time: The start date for fitting the model, set to 2008-01-01.
  
  fit_end_time: The end date for fitting the model, set to 2014-12-31.
  
  features and labels: Generated via a nested data loader combining Alpha158DL (for engineered features such as RESI5, WVMA5, RSQR5, KLEN, etc.) and a StaticDataLoader that loads precomputed factor files (combined_factors_df.parquet).
  
  normalization: The pipeline includes RobustZScoreNorm (with clipping) and Fillna for inference, and DropnaLabel with CSZScoreNorm for training.
  
  Training Configuration:
  
  Model: Uses GeneralPTNN, a PyTorch-based neural network model.
  
  Dataset Splits:
  
  train: 2008-01-01 to 2014-12-31
  
  valid: 2015-01-01 to 2016-12-31
  
  test: 2017-01-01 to 2020-08-01
  
  Default Hyperparameters (can be overridden by command-line arguments):
  
  n_epochs: 100
  
  lr: 2e-4
  
  early_stop: 10
  
  batch_size: 256
  
  weight_decay: 0.0
  
  metric: loss
  
  loss: mse
  
  n_jobs: 20
  
  GPU: 0 (uses GPU 0 if available)
  
  Backtesting and Evaluation:
  
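  strategy: TopkDropoutStrategy, which holds about 50 top-scored stocks (topk) and replaces up to 5 of them (n_drop) at each rebalance. With Qlib's default settings, the holdings sold are the lowest-ranked ones rather than a random pick.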
  
  backtest period: 2017-01-01 to 2020-08-01
  
  initial capital: 100,000,000
  
  cost configuration: Includes open/close costs, minimum transaction costs, and slippage control.
  
  Recording and Analysis:
  
  SignalRecord: Logs predicted signals.
  
  SigAnaRecord: Performs signal analysis without long-short separation.
  
  PortAnaRecord: Conducts portfolio analysis using the configured strategy and backtest settings.
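The top-k/dropout idea behind the backtest strategy can be sketched in a few lines of plain Python. This is a simplified illustration, not Qlib's implementation. It assumes that the holdings sold are the lowest-scored ones, that every held stock has a score, and it ignores tradability, position sizing, and costs. `topk_dropout_step` is a hypothetical name.

```python
def topk_dropout_step(holdings, scores, topk=50, n_drop=5):
    """Return (sell, buy) lists for one rebalance.

    holdings: set of currently held stock ids
    scores:   dict mapping stock id -> predicted score (higher is better)
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    held = [s for s in ranked if s in holdings]            # held, best first
    candidates = [s for s in ranked if s not in holdings]  # not held, best first

    # Consider enough new names to fill empty slots plus n_drop replacements.
    today = candidates[: max(n_drop + topk - len(held), 0)]

    # Rank held and candidate names together; the worst n_drop are sell targets.
    combined = sorted(held + today, key=scores.get, reverse=True)
    worst = set(combined[-n_drop:]) if n_drop > 0 else set()

    sell = [s for s in held if s in worst]
    buy = today[: max(len(sell) + topk - len(held), 0)]
    return sell, buy


if __name__ == "__main__":
    scores = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.7, "E": 0.2}
    print(topk_dropout_step({"A", "B", "C"}, scores, topk=3, n_drop=1))
```

Capping replacements at n_drop per step limits turnover, which matters once the open/close costs in the configuration are applied.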