Skip to content

Source code for the AAAI 2025 paper "TimeCAP: Learning to Contextualize, Augment, and Predict Time Series Events with Large Language Model Agents."

Notifications You must be signed in to change notification settings

geon0325/TimeCAP

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

32 Commits
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

TimeCAP

Code and datasets of TimeCAP: Learning to Contextualize, Augment, and Predict Time Series Events with Large Language Model Agents, (AAAI 2025).

  • The arXiv version can be found here.
  • The supplementary document can be found here.

The code for the multi-modal encoder is available upon request: geonlee0325@kaist.ac.kr.

Time Series Datasets

We release seven time series datasets from three different domains, which can be found here.

  • 🌀️ Weather: weather_ny (New York), weather_sf (San Francisco), and weather_hs (Houston)
  • πŸ’° Finance: finance_sp500 (S&P 500) and finance_nikkei (Nikkei 225)
  • πŸ₯ Healthcare: healthcare_mortality (Mortality Rate) and healthcare_positive (Test-Positive Rate)

You can load the dataset as follows (e.g., weather datasets):

import pickle as pkl

with open('indices.pkl', 'rb') as f:
    indices = pkl.load(f)

with open(f'time_series_{city}.pkl', 'rb') as f:
    data = pkl.load(f)

You can load the labels of time series events as follows (e.g., weather datasets):

with open(f'rain_{city}.pkl', 'rb') as f:
    labels = pkl.load(f)

# weather
0: not rained / 1: rained

# finance
0: decreased / 1: neutral / 2: increased

# healthcare
0: did not exceed the average / 1: exceeded the average

The total number of events is given by len(indices). The correspondence between the time series and labels is as follows:

for _i in range(len(indices)):
    i = indices[_i]
    # Time series
    time_series = data[i:i+window_size]
    # Label
    label = labels[_i]

Each dataset directory contains:

  • gpt_summary: Textual summaries of time series generated by GPT-4.
  • gpt_predict_time: Predictions generated by GPT-4 based on time series.
  • gpt_predict_text: Predictions generated by GPT-4 based on textual summaries (TimeCP).
  • gpt_predict_in-context: Predictions generated by TimeCAP.

P1. Contextualization of Time Series

We contextualize time series using LLM. The prompt of each dataset is as follows:

  • 🌀️ Weather
# System Prompt
Your job is to act as a professional weather analyst. You will write a high-quality report that is informative and helps in understanding the current weather situation.

# User Prompt
Your task is to analyze key weather indicators in {city_name} over the last {window_size} hours. Review the time-series data provided for the last {window_size} hours. Each time-series consists of hourly values separated by a \'|\' token for the following indicators:
- Temperature (Kelvin): {temperature}
- Humidity (%): {humidity}
- Air Pressure (hPa): {pressure}
- Wind Speed (m/s): {wind_speed}
- Wind Direction (degrees): {wind_direction}
Based on this time-series data, write a concise report that provides insights crucial for understanding the current weather situation. Your report should be limited to five sentences, yet comprehensive, highlighting key trends and considering their potential impact on the weather in {city_name}. Do not write numerical values while writing the report.
  • πŸ’° Finance
# System Prompt
Your job is to act as a professional finance analyst. You will write a high-quality report that is informative and helps in understanding the current financial situation.

# User Prompt
Your task is to analyze key financial indicators over the last {window_size} market days. Review the time-series data provided for the last {window_size} market days. Each time-series consists of daily values separated by a \'|\' token for the following indicators:
- S&P 500: {s_p_500}
- VIX (Volatility Index): {vix}
- Nikkei 225: {nikkei_225}
- FTSE 100: {ftse_100}
- Gold Futures: {gold_futures}
- Crude Oil Futures: {crude_oil_futures}
- Exchange rate for EUR/USD: {eur_usd}
- Exchange rate for USD/JYP: {usd_jpy}
- Exchange rate for USD/CNY: {usd_cny}
Based on this time-series data, write a concise report that provides insights crucial for understanding the current financial situation. Your report should be limited to five sentences, yet comprehensive, highlighting key trends and considering their potential impact on the market. Do not write numerical values while writing the report.
  • πŸ₯ Healthcare
# System Prompt
Your job is to act as a professional healthcare analyst. You will write a high-quality report that is informative and helps understand the current healthcare situation.

# User Prompt
Your task is to analyze the respiratory specimens testing positive for influenza over the last {window_size} weeks. The average ratio of positive speciemens is 6.26%. Review the time-series data provided for the last {window_size} weeks. Each time-series consists of weekly values separated by a \'|\' token for the following indicators:
- Number of specimens tested: {total_specimens}
- Number of positive specimens for Influenza A: {total_a}
- Number of positive specimens for Influenza B: {total_b}
- Ratio of positive specimens (%): {pos_rate}
- Ratio of positive specimens for Influenza A (%): {a_rate}
- Ratio of positive specimens for Influenza B (%): {b_rate}
Based on this time-series data, write a concise report that provides insights crucial for understanding the current healthcare situation. Your report should be limited to five sentences, yet comprehensive, highlighting key trends and considering their potential impact on the healthcare system. Do not write redundant information.

P2. Prediction Based on Time Series

We predict time series events using time series as inputs. The prompt of each dataset is as follows:

  • 🌀️ Weather
# System Prompt
Your job is to act as a professional weather forecaster. You will be given a time-series data of the weather from the past 24 hours. Based on this information, your task is to predict whether it will rain in the next 24 hours.

# User Prompt
Your task is to predict whether it will rain or not in {city_name} in the next {window_size} hours. Review the time-series data provided for the last {window_size} hours. Each time-series consists of hourly values separated by a \'|\' token for the following indicators:
- Temperature (Kelvin): {temperature}
- Humidity (%): {humidity}
- Air Pressure (hPa): {pressure}
- Wind Speed (m/s): {wind_speed}
- Wind Direction (degrees): {wind_direction}
Based on this information, respond with either \'rain\' or \'not rain\'. Do not provide any other details.
  • πŸ’° Finance
# System Prompt
Your job is to act as a professional financial forecaster. You will be given a time-series data from the past 20 market days. Based on this information, your task is to predict whether the {indicator_name} price will decrease by more than 1%, increase by more than 1%, or change minimally in the next market day.

# User Prompt
Your task is to predict whether the {indicator_name} price will: (1) Decrease: decrease by more than 1% (2) Increase: increase by more than 1% (3) Neutral: change minimally, between -1% to 1%\nin the next market day. Review the time-series data provided for the last {window_size} market days. Each time-series consists of daily values separated by a \'|\' token for the following indicators:
- S&P 500: {s_p_500}
- VIX (Volatility Index): {vix}
- Nikkei 225: {nikkei_225}
- FTSE 100: {ftse_100}
- Gold Futures: {gold_futures}
- Crude Oil Futures: {crude_oil_futures}
- Exchange rate for EUR/USD: {eur_usd}
- Exchange rate for USD/JYP: {usd_jpy}
- Exchange rate for USD/CNY: {usd_cny}
Based on this information, predict whether the {indicator2name[indicator]} price will decrease by more than 1%, increase by more than 1%, or otherwise, in the next market day. Respond with either \'decrease\', \'increase\', or \'neutral\'. Do not provide any other details. 
  • πŸ₯ Healthcare
# System Prompt
Your job is to act as a professional healthcare forecaster. You will be given a time-series data from the past 20 weeks. Based on this information, your task is to predict whether the ratio of mortality from Influenza or Pneumonia to the total number of death will exceed its average in the comming week.

# User Prompt
Your task is to predict whether the percentage of respiratory specimens testing positive for influenza will: (1) Exceed its average of 6.26% (2) Not exceed its average of 6.26% in the coming week. Review the time-series data provided for the last {window_size} weeks. Each time-series consists of weekly values separated by a \'|\' token for the following indicators:"
- Number of specimens tested: {total_specimens}
- Number of positive specimens for Influenza A: {total_a}
- Number of positive specimens for Influenza B: {total_b}
- Ratio of positive specimens (%): {pos_rate}
- Ratio of positive specimens for Influenza A (%): {a_rate}
- Ratio of positive specimens for Influenza B (%): {b_rate}
Based on this time-series data, predict whether the percentage of respiratory specimens testing positive for influenza will exceed its average of 6.26% or not in the comming week. Respond with either \'exceed\' or \'not exceed\'. Do not provide any other details.

P3. Prediction Based on Text

We predict time series events using text (generated by LLMs above) as inputs. The prompt of each dataset is as follows:

  • 🌀️ Weather
# System Prompt
Your job is to act as a professional weather forecaster. You will be given a summary of the weather from the past 24 hours. Based on this information, your task is to predict whether it will rain in the next 24 hours.

# User Prompt
Your task is to predict whether it will rain or not in {city_name} in the next {window_size} hours. The weather of the past 24 hours is summarized as follows:
{TEXT}
Based on this information, respond with either \'rain\' or \'not rain\'. Do not provide any other details. 
  • πŸ’° Finance
# System Prompt
Your job is to act as a professional financial forecaster. You will be given a financial summary of the past 20 market days. Based on this information, your task is to predict whether the {indicator_name} price will decrease by more than 1%, increase by more than 1%, or change minimally in the next market day.

# User Prompt
Your task is to predict whether the {indicator_name} price will: (1) Decrease: decrease by more than 1% (2) Increase: increase by more than 1% (3) Neutral: change minimally, between -1% to 1%\nin the next market day. The financial situation of the last {window_size} market days is summarized as follows:
{TEXT}
Based on this information, predict whether the {indicator_name} price will decrease by more than 1%, increase by more than 1%, or otherwise (neutral), in the next market day. Respond with either \'decrease\', \'increase\', or \'neutral\'. Do not provide any other details. 
  • πŸ₯ Healthcare
# System Prompt
Your job is to act as a professional healthcare forecaster. You will be given a healthcare summary of the past 20 weeks. Based on this information, your task is to predict whether the percentage of respiratory specimens testing positive for influenza will exceed the average threshold in the comming week.

# User Prompt
Your task is to predict whether the percentage of respiratory specimens testing positive for influenza will: (1) Exceed its average of 6.26% (2) Not exceed its average of 6.26% in the coming week. The healthcare situation of the last {window_size} weeks is summarized as follows:
{TEXT}
Analyze this summary and predict whether the percentage of respiratory specimens testing positive for influenza will exceed the average of 6.26% or not. Respond with either \'exceed\' or \'not exceed\'. Do not provide any other details.

P4. Prediction of TimeCAP

We predict time series events using text (generated by LLMs above) with in-context examples as inputs. The prompt of each dataset is as follows:

  • 🌀️ Weather
# System Prompt
Your job is to act as a professional weather forecaster. You will be given a summary of the weather from the past 24 hours. Based on this information, your task is to predict whether it will rain in the next 24 hours.

# User Prompt
Your task is to predict whether it will rain or not in {city_full_name[city]} in the next {window_size} hours. 
First, review the following {k} examples of weather summaries and outcomes so that you can refer to when making predictions.
{In-context example 1: Text & Output}
...
{In-context example k: Text & Output}
The weather of the last 24 hours is summarized as follows:
{TEXT}
Based on the understanding of the provided examples, predict the outcome of the current weather summary. Respond your prediction with either 'rain' or 'not rain'. Response should not include other terms.
  • πŸ’° Finance
# System Prompt
Your job is to act as a professional financial forecaster. You will be given a summary of the financial situation of the past 20 market days. Based on this information, your task is to predict whether the {indicator_name} price will decrease by more than 1%, increase by more than 1%, or change minimally in the next market day.

# User Prompt
Your task is to predict whether the {indicator_name} price will: (1) Decrease: decrease by more than 1% (2) Increase: increase by more than 1% (3) Neutral: change minimally, between -1% to 1%\nin the next market day.
First, review the following {k} examples of financial summaries and {indicator2name[indicator]} outcomes so that you can refer to when making predictions.
{In-context example 1: Text & Output}
...
{In-context example k: Text & Output}
The financial situation of the last {window_size} market days is summarized as follows:
{TEXT}
Refer to the provided examples and predict the outcome of the current financial summary. Respond your prediction with either 'decrease', 'increase' or 'neutral'. Response should not include other terms.
  • πŸ₯ Healthcare
# System Prompt
Your job is to act as a professional healthcare forecaster. You will be given a healthcare summary of the past 20 weeks. Based on this information, your task is to predict whether the percentage of respiratory specimens testing positive for influenza will exceed the average threshold in the comming week.

# User Prompt
Your task is to predict whether the percentage of respiratory specimens testing positive for influenza will: (1) Exceed its average of 6.26% (2) Not exceed its average of 6.26% in the coming week.
First, review the following {k} examples of healthcare summaries and their outcomes so that you can refer to when making predictions.
{In-context example 1: Text & Output}
...
{In-context example k: Text & Output}
The healthcare situation of the last {window_size} weeks is summarized as follows:
{TEXT}
Refer to the provided examples and predict the outcome of the current healthcare summary. Respond with either \'exceed\' or \'not exceed\'. Response should not include other terms.

About

Source code for the AAAI 2025 paper "TimeCAP: Learning to Contextualize, Augment, and Predict Time Series Events with Large Language Model Agents."

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published