Empty Option Contracts Returned if Invoked inside the Lean Environment (both Backtest and Research) #9

Open
efJerryYang opened this issue Feb 6, 2025 · 4 comments
efJerryYang commented Feb 6, 2025

Problem Description

If I directly run the test cases in the repository, everything runs smoothly. More specifically, I tried the following code in this repository:

        // ThetaDataOptionChainProviderTests.cs
        ...
        private static IEnumerable<Symbol> UnderlyingSymbols
        {
            get
            {
                TestGlobals.Initialize();
                yield return Symbol.Create("SPY", SecurityType.Equity, Market.USA); // New Code Here
                yield return Symbol.Create("XEO", SecurityType.Index, Market.USA);
                yield return Symbol.Create("DJX", SecurityType.Index, Market.USA);
            }
        }

        [TestCaseSource(nameof(UnderlyingSymbols))]
        public void GetOptionContractList(Symbol symbol)
        {
            var referenceDate = new DateTime(2024, 03, 28);
            var optionChain = _thetaDataOptionChainProvider.GetOptionContractList(symbol, referenceDate).ToList();

            // New Code Here
            if (symbol.SecurityType == SecurityType.Equity && symbol.ID.Symbol == "SPY")
            {
                // len == 8334
                Console.WriteLine($"Number of contracts for {symbol.ID.Symbol}: {optionChain.Count} on {referenceDate}");
                Assert.That(optionChain, Has.Count.EqualTo(8334));
            }

            ...
        }

I then ran the test with:

dotnet test --filter "FullyQualifiedName=QuantConnect.Lean.DataSource.ThetaData.Tests.ThetaDataOptionChainProviderTests.GetOptionContractList"

The test case runs without an error, and the API returns the expected number of contracts. Everything works as expected here.

However, when the same call runs inside a backtest or a research notebook, it returns zero contracts. The details below point to a problem in how this repository integrates with the Lean engine.

The code correctly sets up the DownloaderDataProvider, as expected, and the logs confirm this:

20250206 09:44:32.340 TRACE:: Log: [OnData]   type(DataCacheProvider._dataProvider): QuantConnect.Lean.Engine.DataFeeds.DownloaderDataProvider
20250206 09:44:32.340 TRACE:: Log: [OnData]   type(_dataProvider._dataDownloader): QuantConnect.Lean.DataSource.ThetaData.ThetaDataDownloader

The ThetaDataDownloader is also set correctly on the DownloaderDataProvider, but it appears never to be called; whatever data is returned seems to come only from the local disk.

Note: I did not run lean data download to fetch local data. My goal is to use the Lean CLI to access the ThetaData API directly instead of QuantConnect's data source.

I have also examined Lean/Engine/HistoricalData/SubscriptionDataReaderHistoryProvider.cs, where the GetHistory method does not return meaningful data. It is invoked from BacktestingChainProvider.GetOptionSymbols.


Alex mentioned that this issue was caused by changes to the option contract fetching logic made four months ago. I have identified the relevant commit in the Lean repository here: QuantConnect/Lean@16c4259, along with some subsequent commits. However, simply mimicking the changes in that commit does not resolve my issue. For example, in that commit:

diff --git a/Algorithm.CSharp/AddAndRemoveOptionContractRegressionAlgorithm.cs b/Algorithm.CSharp/AddAndRemoveOptionContractRegressionAlgorithm.cs
index 0a8cce770..4ae44e2c4 100644
--- a/Algorithm.CSharp/AddAndRemoveOptionContractRegressionAlgorithm.cs
+++ b/Algorithm.CSharp/AddAndRemoveOptionContractRegressionAlgorithm.cs
@@ -40,8 +40,8 @@ namespace QuantConnect.Algorithm.CSharp
 
             var aapl = QuantConnect.Symbol.Create("AAPL", SecurityType.Equity, Market.USA);
 
-            _contract = OptionChainProvider.GetOptionContractList(aapl, Time)
-                .OrderBy(symbol => symbol.ID.Symbol)
+            _contract = OptionChain(aapl)
+                .OrderBy(x => x.ID.Symbol)
                 .FirstOrDefault(optionContract => optionContract.ID.OptionRight == OptionRight.Call
                     && optionContract.ID.OptionStyle == OptionStyle.American);
             AddOptionContract(_contract);

And if I use the following code directly, the contract count is still 0:

var contracts = OptionChain(this.underlying).ToList(); // 0 contracts

This suggests that the integration in this repository is outdated. Could you help resolve this, or at least point me to a 'correctly updated' DataSource repository? That would let me understand the correct way to integrate with the Lean Engine and prepare a fix here.

Thanks!


For convenience, I have provided a minimal project to reproduce the issue: https://github.com/efJerryYang/ThetaDataWithLeanEmptyResultDemo

I used a fixed Lean Engine version for reproducibility:

❯ lean backtest 'Jumping Red-Orange Penguin' --image quantconnect/lean:16920 --data-provider-historical ThetaData --thetadata-subscription-plan Standard

The resulting logs look like this:

20250206 09:44:32.335 TRACE:: Debug: Accurate daily end-times now enabled by default. See more at https://qnt.co/3YHaWHL. To disable it and use legacy daily bars set self.settings.daily_precise_end_time = False.
20250206 09:44:32.335 TRACE:: StopSafely(): Waiting for 'Result Thread' thread to stop...
20250206 09:44:32.335 TRACE:: Debug: Warning: The following securities were set to raw price normalization mode to work with options: SPY...
20250206 09:44:32.335 TRACE:: Log: [Initialize] Start Date: 3/28/2024 12:00:00 AM
20250206 09:44:32.336 TRACE:: Log: [Initialize] End Date: 3/28/2024 12:00:00 AM
20250206 09:44:32.336 TRACE:: Log: [Initialize] Cash: 100000.0
20250206 09:44:32.336 TRACE:: Log: [Initialize] Before AddEquity: 0
20250206 09:44:32.336 TRACE:: Log: [Initialize] After AddEquity: 1
20250206 09:44:32.336 TRACE:: Log: [Initialize] Before AddOption: 1
20250206 09:44:32.336 TRACE:: Log: [Initialize] After AddOption: 2
20250206 09:44:32.336 TRACE:: Log: [OnData] ======================== OnData [Start] ========================
20250206 09:44:32.336 TRACE:: Log: [OnData] Time: 3/28/2024 10:00:00 AM
20250206 09:44:32.336 TRACE:: Log: [OnData] type(OptionChainProvider): QuantConnect.Lean.Engine.DataFeeds.CachingOptionChainProvider
20250206 09:44:32.337 TRACE:: Log: [OnData] Available contracts by OptionChainProvider.GetOptionContractList(underlying, ...): 0
20250206 09:44:32.337 TRACE:: Log: [OnData]   ------------------- Reflection [Start] -------------------
20250206 09:44:32.337 TRACE:: Log: [OnData]   type(OptionChainProvider._optionChainProvider): QuantConnect.Lean.Engine.DataFeeds.BacktestingOptionChainProvider
20250206 09:44:32.337 TRACE:: Log: [OnData]   type(_optionChainProvider.DataCacheProvider): QuantConnect.Lean.Engine.DataFeeds.ZipDataCacheProvider
20250206 09:44:32.337 TRACE:: Log: [OnData]   type(DataCacheProvider._dataProvider): QuantConnect.Lean.Engine.DataFeeds.DownloaderDataProvider
20250206 09:44:32.337 TRACE:: Log: [OnData]   type(_dataProvider._dataDownloader): QuantConnect.Lean.DataSource.ThetaData.ThetaDataDownloader
20250206 09:44:32.337 TRACE:: Log: [OnData]   ------------------- Reflection [End] -------------------
20250206 09:44:32.337 TRACE:: Log: [OnData] ======================== OnData [End] ========================
...
20250206 09:44:32.342 TRACE:: Log: [OnData] ======================== OnData [Start] ========================
20250206 09:44:32.342 TRACE:: Log: [OnData] Time: 3/28/2024 4:00:00 PM
20250206 09:44:32.342 TRACE:: Log: [OnData] type(OptionChainProvider): QuantConnect.Lean.Engine.DataFeeds.CachingOptionChainProvider
20250206 09:44:32.342 TRACE:: Log: [OnData] Available contracts by OptionChainProvider.GetOptionContractList(underlying, ...): 0
20250206 09:44:32.342 TRACE:: Log: [OnData]   ------------------- Reflection [Start] -------------------
20250206 09:44:32.342 TRACE:: Log: [OnData]   type(OptionChainProvider._optionChainProvider): QuantConnect.Lean.Engine.DataFeeds.BacktestingOptionChainProvider
20250206 09:44:32.343 TRACE:: Log: [OnData]   type(_optionChainProvider.DataCacheProvider): QuantConnect.Lean.Engine.DataFeeds.ZipDataCacheProvider
20250206 09:44:32.343 TRACE:: Log: [OnData]   type(DataCacheProvider._dataProvider): QuantConnect.Lean.Engine.DataFeeds.DownloaderDataProvider
20250206 09:44:32.343 TRACE:: Log: [OnData]   type(_dataProvider._dataDownloader): QuantConnect.Lean.DataSource.ThetaData.ThetaDataDownloader
20250206 09:44:32.343 TRACE:: Log: [OnData]   ------------------- Reflection [End] -------------------
20250206 09:44:32.343 TRACE:: Log: [OnData] ======================== OnData [End] ========================
20250206 09:44:32.343 TRACE:: Debug: Algorithm Id:(1256374201) completed in 0.19 seconds at 0k data points per second. Processing total of 22 data points.
20250206 09:44:32.343 TRACE:: Debug: Your log was successfully created and can be retrieved from: /Results/1256374201-log.txt
20250206 09:44:32.343 TRACE:: BacktestingResultHandler.Run(): Ending Thread...
20250206 09:44:32.381 TRACE::

andywatts commented Feb 16, 2025

@efJerryYang "closed as completed"?

Within a backtest I'm able to download Index data, but not IndexOption. Suspect it's the same issue.

FWIW, the command below downloads quotes correctly, but there is no way to prefilter the strikes, so the download gets too big at fine resolutions.

lean data download --data-provider-historical ThetaData --data-type Quote --resolution Minute --security-type IndexOption --ticker SPXW --start 20250127 --end 20250128 --thetadata-subscription-plan Pro --market usa
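
For illustration only, here is a rough sketch of what prefiltering strikes could look like if the contract list were available before the quote download. That availability is an assumption; nothing in this thread confirms the downloader offers it. The Contract record, the prefilter function, and its width_pct and max_dte parameters are all hypothetical names, not Lean or ThetaData types.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Contract:
    """Hypothetical minimal contract record; not a Lean or ThetaData type."""
    strike: float
    expiry: date
    right: str  # "C" for call, "P" for put


def prefilter(contracts, spot, as_of, width_pct=0.05, max_dte=7):
    """Keep contracts whose strike lies within +/- width_pct of spot
    and which expire between as_of and as_of + max_dte days (inclusive)."""
    low = spot * (1 - width_pct)
    high = spot * (1 + width_pct)
    return [
        c for c in contracts
        if low <= c.strike <= high
        and 0 <= (c.expiry - as_of).days <= max_dte
    ]


if __name__ == "__main__":
    # Made-up sample chain, only to exercise the filter.
    chain = [
        Contract(5000.0, date(2025, 1, 28), "C"),
        Contract(6000.0, date(2025, 1, 28), "C"),
        Contract(6100.0, date(2025, 1, 28), "P"),
        Contract(6000.0, date(2025, 3, 21), "C"),
    ]
    kept = prefilter(chain, spot=6000.0, as_of=date(2025, 1, 27))
    print(f"Kept {len(kept)} of {len(chain)} contracts")
```

Only the contracts that survive the filter would then need minute-resolution quotes, which is where most of the download volume comes from.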

@efJerryYang
Author

@andywatts Hi, we already discussed this on Discord, but I am summarizing it here for others' reference.

Using a third-party provider requires us to prepare the Universe Dataset ourselves. This can be done inside the Lean Docker image: we need to call Symbol.Create in the environment provided by the container to generate the #symbol_id.

Avoid using their downloader if you need to fetch years of data. It used synchronous requests on a single thread, so downloading the OpenInterest data alone would take forever (an estimated 12x24 hours...). In my attempt, with 32 threads and roughly 256 concurrent connections, the OpenInterest data took only several minutes, and downloading the Quote data and converting it to a Lean-compatible format took a couple of hours.
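
As an illustration of the general idea (not the actual script used here), a thread pool can issue many blocking requests at once instead of one after another. In the sketch below, fetch_open_interest is a hypothetical placeholder that only sleeps to simulate network latency; a real version would call the data API and write the result to disk. The helper names trading_days and download_all are likewise made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta
import time


def fetch_open_interest(day: date) -> dict:
    """Hypothetical stand-in for one blocking HTTP request to the data API."""
    time.sleep(0.01)  # simulate network latency
    return {"date": day.isoformat(), "rows": 0}


def trading_days(start: date, end: date) -> list:
    """Weekdays between start and end, inclusive (holidays are not excluded)."""
    days = []
    current = start
    while current <= end:
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days.append(current)
        current += timedelta(days=1)
    return days


def download_all(days, fetch=fetch_open_interest, max_workers=32):
    """Run one request per day on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, days))


if __name__ == "__main__":
    days = trading_days(date(2024, 3, 1), date(2024, 3, 28))
    results = download_all(days)
    print(f"Fetched {len(results)} daily responses")
```

Because the work is I/O-bound, threads spend most of their time waiting on the network, so the wall-clock time shrinks roughly with the number of workers until the server or the subscription plan's rate limit becomes the bottleneck.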

There are still many other issues you may encounter in local development...

@Martin-Molinero
Member

Sorry for the delay. Please do report any bugs, along with steps to reproduce them, so we can fix them ASAP.

Author

efJerryYang commented Feb 17, 2025

Hi, thanks for the update.

I want to clarify that this is not a bug but a misunderstanding of the dataset requirements. Users need to prepare their Universe Dataset in a Lean-compatible format to run backtests, research, or live trading via ThetaData. Please close the issue if you believe this is outside the scope of QuantConnect support.
