Finance Quant Agent

🥇 The First Data-Centric Quant Multi-Agent Framework: RD-Agent(Q)

R&D-Agent for Quantitative Finance, RD-Agent(Q) for short, is the first data-centric, multi-agent framework designed to automate full-stack research and development of quantitative strategies through coordinated factor-model co-optimization.

You can learn more details about RD-Agent(Q) through the paper.

⚡ Quick Start

Before you start, make sure RD-Agent is installed and its environment is configured correctly. For installation and configuration instructions, refer to the documentation.

Then run the framework with the following steps:

  • 🐍 Create a Conda Environment

    • Create a new conda environment with Python (3.10 and 3.11 are well tested in our CI):

      conda create -n rdagent python=3.10
      
    • Activate the environment:

      conda activate rdagent
      
  • 📦 Install RD-Agent

    • Install the rdagent package from PyPI:

      pip install rdagent
      
  • 🚀 Run the Application

    • Run the application directly with the following command:

      rdagent fin_quant
      

🛠️ Usage of modules

  • Env Config

The following environment variables can be set in the .env file to customize the application’s behavior:

pydantic settings rdagent.app.qlib_rd_loop.conf.QuantBasePropSetting

JSON schema:
{
   "title": "QuantBasePropSetting",
   "type": "object",
   "properties": {
      "scen": {
         "default": "rdagent.scenarios.qlib.experiment.quant_experiment.QlibQuantScenario",
         "title": "Scen",
         "type": "string"
      },
      "knowledge_base": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Knowledge Base"
      },
      "knowledge_base_path": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Knowledge Base Path"
      },
      "hypothesis_gen": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Hypothesis Gen"
      },
      "interactor": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Interactor"
      },
      "hypothesis2experiment": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Hypothesis2Experiment"
      },
      "coder": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Coder"
      },
      "runner": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Runner"
      },
      "summarizer": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Summarizer"
      },
      "evolving_n": {
         "default": 10,
         "title": "Evolving N",
         "type": "integer"
      },
      "quant_hypothesis_gen": {
         "default": "rdagent.scenarios.qlib.proposal.quant_proposal.QlibQuantHypothesisGen",
         "title": "Quant Hypothesis Gen",
         "type": "string"
      },
      "model_hypothesis2experiment": {
         "default": "rdagent.scenarios.qlib.proposal.model_proposal.QlibModelHypothesis2Experiment",
         "title": "Model Hypothesis2Experiment",
         "type": "string"
      },
      "model_coder": {
         "default": "rdagent.scenarios.qlib.developer.model_coder.QlibModelCoSTEER",
         "title": "Model Coder",
         "type": "string"
      },
      "model_runner": {
         "default": "rdagent.scenarios.qlib.developer.model_runner.QlibModelRunner",
         "title": "Model Runner",
         "type": "string"
      },
      "model_summarizer": {
         "default": "rdagent.scenarios.qlib.developer.feedback.QlibModelExperiment2Feedback",
         "title": "Model Summarizer",
         "type": "string"
      },
      "factor_hypothesis2experiment": {
         "default": "rdagent.scenarios.qlib.proposal.factor_proposal.QlibFactorHypothesis2Experiment",
         "title": "Factor Hypothesis2Experiment",
         "type": "string"
      },
      "factor_coder": {
         "default": "rdagent.scenarios.qlib.developer.factor_coder.QlibFactorCoSTEER",
         "title": "Factor Coder",
         "type": "string"
      },
      "factor_runner": {
         "default": "rdagent.scenarios.qlib.developer.factor_runner.QlibFactorRunner",
         "title": "Factor Runner",
         "type": "string"
      },
      "factor_summarizer": {
         "default": "rdagent.scenarios.qlib.developer.feedback.QlibFactorExperiment2Feedback",
         "title": "Factor Summarizer",
         "type": "string"
      },
      "action_selection": {
         "default": "bandit",
         "title": "Action Selection",
         "type": "string"
      },
      "train_start": {
         "default": "2008-01-01",
         "title": "Train Start",
         "type": "string"
      },
      "train_end": {
         "default": "2014-12-31",
         "title": "Train End",
         "type": "string"
      },
      "valid_start": {
         "default": "2015-01-01",
         "title": "Valid Start",
         "type": "string"
      },
      "valid_end": {
         "default": "2016-12-31",
         "title": "Valid End",
         "type": "string"
      },
      "test_start": {
         "default": "2017-01-01",
         "title": "Test Start",
         "type": "string"
      },
      "test_end": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": "2020-08-01",
         "title": "Test End"
      }
   },
   "additionalProperties": false
}
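
Several defaults in the schema above are dotted class paths such as rdagent.scenarios.qlib.experiment.quant_experiment.QlibQuantScenario. A string of this form is commonly resolved to a class object with importlib. The sketch below illustrates that general pattern; load_class is a hypothetical helper, not a function taken from RD-Agent, and it is demonstrated on a standard-library class so that it runs without RD-Agent installed.

```python
import importlib


def load_class(dotted_path: str) -> type:
    """Resolve 'package.module.ClassName' to the class object (hypothetical helper)."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Demonstrated on a standard-library class; with RD-Agent installed, the same
# call could be given one of the default paths from the schema.
ordered_dict_cls = load_class("collections.OrderedDict")
print(ordered_dict_cls.__name__)
```

Storing class paths as strings is what allows a component to be swapped by changing one setting instead of editing code.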

Config:
  • env_prefix: str = QLIB_QUANT_

  • protected_namespaces: tuple = ()
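
With env_prefix set to QLIB_QUANT_, each field listed below corresponds to an environment variable whose name is the prefix followed by the field name. The sketch below builds such .env lines. It assumes the default pydantic-settings behavior of matching variable names case-insensitively, so the upper-case spelling is a convention rather than a requirement; env_var_name and render_dotenv are hypothetical helpers written for this illustration.

```python
def env_var_name(prefix: str, field_name: str) -> str:
    """Build the environment variable name for a settings field (hypothetical helper)."""
    return f"{prefix}{field_name}".upper()


def render_dotenv(prefix: str, overrides: dict) -> str:
    """Render one NAME=value line per overridden field."""
    return "\n".join(
        f"{env_var_name(prefix, name)}={value}" for name, value in overrides.items()
    )


# Field names and the prefix come from this page; the chosen values are examples.
overrides = {"evolving_n": 5, "action_selection": "random", "test_end": "2020-08-01"}
print(render_dotenv("QLIB_QUANT_", overrides))
```

The printed lines can be pasted into the .env file to override the corresponding defaults.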

field action_selection: str = 'bandit'

Action selection strategy: ‘bandit’ for bandit-based selection, ‘llm’ for LLM-based selection, ‘random’ for random selection
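
To give a sense of what bandit-based selection means, the sketch below uses the classic UCB1 rule to choose between two actions. This is an illustration only: the page does not state which bandit algorithm RD-Agent(Q) uses, and the action names ("factor", "model") and the fixed rewards are assumptions made for the example.

```python
import math


class UCB1Selector:
    """Pick the action with the highest upper confidence bound (UCB1)."""

    def __init__(self, actions):
        self.actions = list(actions)
        self.counts = {a: 0 for a in self.actions}
        self.mean_reward = {a: 0.0 for a in self.actions}

    def select(self) -> str:
        # Try every action once before relying on the confidence bound.
        for action in self.actions:
            if self.counts[action] == 0:
                return action
        total = sum(self.counts.values())

        def upper_bound(action: str) -> float:
            bonus = math.sqrt(2.0 * math.log(total) / self.counts[action])
            return self.mean_reward[action] + bonus

        return max(self.actions, key=upper_bound)

    def update(self, action: str, reward: float) -> None:
        # Incremental mean of the rewards observed for this action.
        self.counts[action] += 1
        self.mean_reward[action] += (reward - self.mean_reward[action]) / self.counts[action]


# Hypothetical setup: two actions with made-up, fixed rewards.
selector = UCB1Selector(["factor", "model"])
rewards = {"factor": 0.2, "model": 0.8}
for _ in range(50):
    action = selector.select()
    selector.update(action, rewards[action])
print(selector.counts)
```

The exploration bonus shrinks as an action is tried more often, so the better-rewarded action is chosen most of the time while the other is still revisited occasionally.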

field evolving_n: int = 10

Number of evolutions

field factor_coder: str = 'rdagent.scenarios.qlib.developer.factor_coder.QlibFactorCoSTEER'

Coder class

field factor_hypothesis2experiment: str = 'rdagent.scenarios.qlib.proposal.factor_proposal.QlibFactorHypothesis2Experiment'

Hypothesis to experiment class

field factor_runner: str = 'rdagent.scenarios.qlib.developer.factor_runner.QlibFactorRunner'

Runner class

field factor_summarizer: str = 'rdagent.scenarios.qlib.developer.feedback.QlibFactorExperiment2Feedback'

Summarizer class

field model_coder: str = 'rdagent.scenarios.qlib.developer.model_coder.QlibModelCoSTEER'

Coder class

field model_hypothesis2experiment: str = 'rdagent.scenarios.qlib.proposal.model_proposal.QlibModelHypothesis2Experiment'

Hypothesis to experiment class

field model_runner: str = 'rdagent.scenarios.qlib.developer.model_runner.QlibModelRunner'

Runner class

field model_summarizer: str = 'rdagent.scenarios.qlib.developer.feedback.QlibModelExperiment2Feedback'

Summarizer class

field quant_hypothesis_gen: str = 'rdagent.scenarios.qlib.proposal.quant_proposal.QlibQuantHypothesisGen'

Hypothesis generation class

field scen: str = 'rdagent.scenarios.qlib.experiment.quant_experiment.QlibQuantScenario'

Scenario class for the Qlib quant scenario

field test_end: str | None = '2020-08-01'

End date of the test / backtest segment

field test_start: str = '2017-01-01'

Start date of the test / backtest segment

field train_end: str = '2014-12-31'

End date of the training segment

field train_start: str = '2008-01-01'

Start date of the training segment

field valid_end: str = '2016-12-31'

End date of the validation segment

field valid_start: str = '2015-01-01'

Start date of the validation segment
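
The six date fields above define three consecutive segments. Before overriding them, it can help to check that the segments are ordered and do not overlap. The sketch below does this with the default values from this page; check_segments is a hypothetical helper, and treating a test_end of None as open-ended is an assumption, since the page does not say what None means.

```python
from datetime import date

# Default values as documented above.
segments = {
    "train": ("2008-01-01", "2014-12-31"),
    "valid": ("2015-01-01", "2016-12-31"),
    "test": ("2017-01-01", "2020-08-01"),
}


def check_segments(segments: dict) -> None:
    """Raise ValueError unless train, valid, test are ordered and disjoint."""
    previous_name, previous_end = None, None
    for name in ("train", "valid", "test"):
        start_text, end_text = segments[name]
        start = date.fromisoformat(start_text)
        # Assumption: an end of None means "no upper bound".
        end = date.max if end_text is None else date.fromisoformat(end_text)
        if start > end:
            raise ValueError(f"{name} segment starts after it ends")
        if previous_end is not None and start <= previous_end:
            raise ValueError(f"{name} segment overlaps the {previous_name} segment")
        previous_name, previous_end = name, end


check_segments(segments)
print("segments are ordered and non-overlapping")
```

Keeping the validation and test segments strictly after the training segment avoids leaking future data into model fitting.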

pydantic settings rdagent.components.coder.factor_coder.config.FactorCoSTEERSettings

JSON schema:
{
   "title": "FactorCoSTEERSettings",
   "type": "object",
   "properties": {
      "coder_use_cache": {
         "default": false,
         "title": "Coder Use Cache",
         "type": "boolean"
      },
      "max_loop": {
         "default": 10,
         "title": "Max Loop",
         "type": "integer"
      },
      "fail_task_trial_limit": {
         "default": 20,
         "title": "Fail Task Trial Limit",
         "type": "integer"
      },
      "v1_query_former_trace_limit": {
         "default": 3,
         "title": "V1 Query Former Trace Limit",
         "type": "integer"
      },
      "v1_query_similar_success_limit": {
         "default": 3,
         "title": "V1 Query Similar Success Limit",
         "type": "integer"
      },
      "v2_query_component_limit": {
         "default": 1,
         "title": "V2 Query Component Limit",
         "type": "integer"
      },
      "v2_query_error_limit": {
         "default": 1,
         "title": "V2 Query Error Limit",
         "type": "integer"
      },
      "v2_query_former_trace_limit": {
         "default": 3,
         "title": "V2 Query Former Trace Limit",
         "type": "integer"
      },
      "v2_add_fail_attempt_to_latest_successful_execution": {
         "default": false,
         "title": "V2 Add Fail Attempt To Latest Successful Execution",
         "type": "boolean"
      },
      "v2_error_summary": {
         "default": false,
         "title": "V2 Error Summary",
         "type": "boolean"
      },
      "v2_knowledge_sampler": {
         "default": 1.0,
         "title": "V2 Knowledge Sampler",
         "type": "number"
      },
      "knowledge_base_path": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Knowledge Base Path"
      },
      "new_knowledge_base_path": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "New Knowledge Base Path"
      },
      "enable_filelock": {
         "default": false,
         "title": "Enable Filelock",
         "type": "boolean"
      },
      "filelock_path": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Filelock Path"
      },
      "max_seconds_multiplier": {
         "default": 1000000,
         "title": "Max Seconds Multiplier",
         "type": "integer"
      },
      "data_folder": {
         "default": "git_ignore_folder/factor_implementation_source_data",
         "title": "Data Folder",
         "type": "string"
      },
      "data_folder_debug": {
         "default": "git_ignore_folder/factor_implementation_source_data_debug",
         "title": "Data Folder Debug",
         "type": "string"
      },
      "simple_background": {
         "default": false,
         "title": "Simple Background",
         "type": "boolean"
      },
      "file_based_execution_timeout": {
         "default": 3600,
         "title": "File Based Execution Timeout",
         "type": "integer"
      },
      "select_method": {
         "default": "random",
         "title": "Select Method",
         "type": "string"
      },
      "python_bin": {
         "default": "python",
         "title": "Python Bin",
         "type": "string"
      }
   },
   "additionalProperties": false
}

Config:
  • env_prefix: str = FACTOR_CoSTEER_

field coder_use_cache: bool = False

Indicates whether to use cache for the coder

field data_folder: str = 'git_ignore_folder/factor_implementation_source_data'

Path to the folder containing financial data (default is fundamental data in Qlib)

field data_folder_debug: str = 'git_ignore_folder/factor_implementation_source_data_debug'

Path to the folder containing partial financial data (for debugging)

field enable_filelock: bool = False

field file_based_execution_timeout: int = 3600

Timeout in seconds for each factor implementation execution

field filelock_path: str | None = None

field knowledge_base_path: str | None = None

Path to the knowledge base

field max_loop: int = 10

Maximum number of task implementation loops

field max_seconds_multiplier: int = 1000000

field new_knowledge_base_path: str | None = None

Path to the new knowledge base

field python_bin: str = 'python'

Path to the Python binary
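
The python_bin and file_based_execution_timeout fields suggest that each factor implementation runs as a separate Python process with a time limit. The sketch below shows that general pattern with the standard subprocess module. It is not RD-Agent's runner: run_factor_script and the sample script are hypothetical, and sys.executable stands in for python_bin so the example runs anywhere.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_factor_script(script_path, python_bin="python", timeout_seconds=3600):
    """Run a script with the given interpreter and return (ok, output)."""
    try:
        completed = subprocess.run(
            [python_bin, str(script_path)],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # The child process is killed once the time limit is exceeded.
        return False, f"timed out after {timeout_seconds} seconds"
    return completed.returncode == 0, completed.stdout + completed.stderr


# Hypothetical stand-in for a generated factor implementation.
with tempfile.TemporaryDirectory() as workdir:
    script = Path(workdir) / "factor.py"
    script.write_text("print('factor computed')\n")
    ok, output = run_factor_script(script, python_bin=sys.executable, timeout_seconds=60)
    print(ok, output.strip())
```

Running generated code in a child process keeps a crash or an endless loop in one implementation from taking down the whole loop.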

field select_method: str = 'random'

Method for selecting factor implementations

field simple_background: bool = False

Whether to use simple background information for code feedback

field v2_add_fail_attempt_to_latest_successful_execution: bool = False

  • Qlib Configuration
    • The .yaml files in the model_template and factor_template directories configure how the corresponding models or factors run within the Qlib framework. Their contents and roles are summarized below:
      • General Settings:
        • provider_uri: Specifies the local Qlib data path, set to ~/.qlib/qlib_data/cn_data.

        • market: Configured to csi300, representing the CSI 300 index constituents.

        • benchmark: Set to SH000300, used for backtesting evaluation.

      • Data Handling:
        • start_time and end_time: Define the full data range, from 2008-01-01 to 2022-08-01.

        • fit_start_time: The start date for fitting the model, set to 2008-01-01.

        • fit_end_time: The end date for fitting the model, set to 2014-12-31.

        • features and labels: Generated via a nested data loader combining Alpha158DL (for engineered features such as RESI5, WVMA5, RSQR5, KLEN, etc.) and a StaticDataLoader that loads precomputed factor files (combined_factors_df.parquet).

        • normalization: The pipeline includes RobustZScoreNorm (with clipping) and Fillna for inference, and DropnaLabel with CSZScoreNorm for training.

      • Training Configuration:
        • Model: Uses GeneralPTNN, a PyTorch-based neural network model.

        • Dataset Splits:
          • train: 2008-01-01 to 2014-12-31

          • valid: 2015-01-01 to 2016-12-31

          • test: 2017-01-01 to 2020-08-01

      • Default Hyperparameters (can be overridden by command-line arguments):
        • n_epochs: 100

        • lr: 2e-4

        • early_stop: 10

        • batch_size: 256

        • weight_decay: 0.0

        • metric: loss

        • loss: mse

        • n_jobs: 20

        • GPU: 0 (uses GPU 0 if available)

      • Backtesting and Evaluation:
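        • strategy: TopkDropoutStrategy, which holds the 50 top-scored stocks (topk) and replaces at most 5 holdings per trading day (n_drop). By default Qlib drops the lowest-scored holdings rather than random ones, which limits turnover.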

        • backtest period: 2017-01-01 to 2020-08-01

        • initial capital: 100,000,000

        • cost configuration: Includes open/close costs, minimum transaction costs, and slippage control.

      • Recording and Analysis:
        • SignalRecord: Logs predicted signals.

        • SigAnaRecord: Performs signal analysis without long-short separation.

        • PortAnaRecord: Conducts portfolio analysis using the configured strategy and backtest settings.
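
The backtesting strategy listed above can be illustrated with a simplified sketch of one top-k/dropout rebalancing step. It assumes the variant that sells the lowest-scored holdings and buys the best-scored stocks not yet held, and it ignores tradability, cash, and position sizing. topk_dropout_step is a hypothetical function written for this page, not Qlib's implementation, and the toy scores are made up.

```python
def topk_dropout_step(scores, holdings, topk=50, n_drop=5):
    """Return (sell, buy) lists for one rebalancing step."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    held = [stock for stock in ranked if stock in holdings]
    # Candidate purchases: the best-scored stocks that are not currently held.
    n_candidates = max(n_drop + topk - len(held), 0)
    candidates = [stock for stock in ranked if stock not in holdings][:n_candidates]
    # Rank holdings and candidates together; holdings that fall into the
    # bottom n_drop of this combined ranking are sold.
    combined = sorted(held + candidates, key=scores.get, reverse=True)
    worst = set(combined[-n_drop:]) if n_drop > 0 else set()
    sell = [stock for stock in held if stock in worst]
    # Buy enough candidates to replace the sold stocks and fill up to topk.
    buy = candidates[: len(sell) + topk - len(held)]
    return sell, buy


# Toy example with made-up scores; the documented configuration uses topk=50, n_drop=5.
scores = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6, "E": 0.5, "F": 0.4}
sell, buy = topk_dropout_step(scores, holdings={"A", "C", "E", "F"}, topk=4, n_drop=1)
print("sell:", sell, "buy:", buy)
```

In the toy example only F is sold and B is bought, while E stays in the portfolio even though D scores higher. Capping the number of replacements per step is what keeps turnover, and therefore transaction cost, bounded.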