Skip to content

ATR introduces look-ahead bias #963

Open
@EladioRocha

Description

@EladioRocha

Expected Behavior

When we trade with trailing stop loss the ATR is calculated. Fo example suppose that we calculate the ATR for 100 periods the first 100 periods should be removed because they don't have values and if you try to open a trade in the period 10 there isn't an available ATR value therefore isn't possible to determine the stop loss. So in that case all the values that contains NaN values should be removed.

Actual Behavior

In the code when the ATR is calculated the values are backward filled with the future values and according to our previous example if a trade is opened in the period 10, then the ATR used is the ATR in the period 101

def set_atr_periods(self, periods: int = 100):
    """
    Set the lookback period for computing ATR. The default value
    of 100 ensures a _stable_ ATR.
    """
    h, l, c_prev = self.data.High, self.data.Low, pd.Series(self.data.Close).shift(1)
    tr = np.max([h - l, (c_prev - h).abs(), (c_prev - l).abs()], axis=0)
    atr = pd.Series(tr).rolling(periods).mean().bfill().values  # Is look-ahead bias introduced using bfill?
    
    self.__atr = atr

Additional info

image

It maintins the ATR calculation and changes until the period 101

My code is the following

class MyStrategy(TrailingStrategy):
    size = 0.3
    atr = 3
    atr_period = 100

    def init(self):
        super().init()
        self.set_trailing_sl(n_atr=self.atr)
        self.set_atr_periods(periods=self.atr_period)

    def next(self):
        super().next()
        
        if not self.position.is_long:
            self.buy(size=self.size)

Activity

kernc

kernc commented on May 3, 2023

@kernc
Owner

True, the .bfill() call here introduces a brief look-ahead bias.

atr = pd.Series(tr).rolling(periods).mean().bfill().values

The assumption was that the ATR for a trailing strategy would be fairly stable.

PR fix welcome!

added
bugSomething isn't working
on May 3, 2023
changed the title [-]The atr introduce look ahead bias or is my interpretation incorrect?[/-] [+]ATR introduces look-ahead bias[/+] on Feb 19, 2025
bravegag

bravegag commented on Feb 23, 2025

@bravegag

A simple fix is to do a ffill before the bfill like this:

atr = pd.Series(tr).rolling(periods).mean().ffill().bfill().values 

this ensures that only the leading (oldest prices chronologically) are backfilled for the purpose of calculating HPR.

Also, you may want to consider using the min_periods argument in the rolling, otherwise and if there is missing price data, it will generate a lot of NAs along the way. For the mean function I would recommend setting it to 1 (meaning a valid non-NA mean result will be provisioned when there is at least 1 non-NA value within each window of 100 prices) so that NAs are avoided e.g.

atr = pd.Series(tr).rolling(periods, min_periods=1).mean().ffill().bfill().values 

Last but not least, possibly the f-b fills need to be moved to before the rolling mean calculation, not sure what the data is but if it is prices then it would make more sense:

atr = pd.Series(tr).ffill().bfill().rolling(periods, min_periods=1).mean().values 

I need to get more familiarity with the backtesting code before submitting PRs :)

bravegag

bravegag commented on Feb 26, 2025

@bravegag

@EladioRocha thanks a lot for reporting this. Can you please double-check my PR #1230 to make sure that it works for you now?

kernc

kernc commented on Feb 27, 2025

@kernc
Owner

this ensures that only the leading (oldest prices chronologically) are backfilled for the purpose of calculating HPR.

@bravegag As far as I understand, ATR is computed from true range, which is computed from OHLC data, which is checked beforehand to contain no NaNs ...

hi, lo, c_prev = self.data.High, self.data.Low, pd.Series(self.data.Close).shift(1)
tr = np.max([hi - lo, (c_prev - hi).abs(), (c_prev - lo).abs()], axis=0)
atr = pd.Series(tr).rolling(periods).mean().bfill().values

if data[['Open', 'High', 'Low', 'Close']].isnull().values.any():
raise ValueError('Some OHLC values are missing (NaN). '
'Please strip those lines with `df.dropna()` or '
'fill them in with `df.interpolate()` or whatever.')

Therefore, there should be no NaNs in ATR other than the leading ones! Which are bfill()-ed already, hence this bug report. 🤔

bravegag

bravegag commented on Feb 27, 2025

@bravegag

OK then if you are fine, close the issue and the PR, my fix just makes sure there won't be any LAB without making any assumptions. Also the bfill should come before the rolling. Otherwise any leading NAs will introduce exactly 99 (100-1) NA prices ahead due to the default min_periods argument to rolling; and the mean, due to the bfill, will be unnecessarily duplicated for all those steps.

kernc

kernc commented on Feb 27, 2025

@kernc
Owner

pd.Series(tr).rolling(periods) introduces periods - 1 NaNs in the front warm-up period. The issue is these (and only these NaNs from rolling) are bfilled instead of e.g. being left as NaN values. This bfilling introduces a look-ahead bias for the warm-up period, more noticeable with short ATR periods.

linked a pull request that will close this issue on Mar 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      ATR introduces look-ahead bias · Issue #963 · kernc/backtesting.py