The AI-regon Trail

Figure - Oregon Trail (1988) MS-DOS Computer Game

An epic adventure building ethical, secure, and explainable AI/ML models in the financial industry – includes a hands-on tutorial written in the Python programming language!

Key Terms: Artificial Intelligence (AI), Machine Learning (ML), Financial Modeling, Risk Management, Python, XGBoost, Generative AI

There’s gold in those hills!

Financial leaders across the world have thrown down a challenge to their quant teams: incorporate AI/ML into the firm’s business analytics before it is too late. “It’s a gold rush and the winner will make a fortune! AI is the future - we can’t be left behind!” For modeling teams, this creates both an opportunity and a challenge. The opportunity is clear: AI/ML tools are readily available and could have a transformative effect on corporate life. The challenges, however, are numerous.

In some ways, this is like the American gold rush of the late 1840s. The discovery of gold in California spawned a massive migration to the west coast of the United States. Settlers moving west to profit from the rush faced many obstacles: imposing mountains, hostile strangers, and dreaded diseases like dysentery all challenged those looking to make their fortune.

Too many shovels, and not enough gold mines

At some point, everyone has probably heard the saying, “During a gold rush, sell shovels”. In the gold rush of 1849, this was a good way to make money; there were many more people who wanted to be miners than there were people selling shovels. However, whenever there are more shovel salesmen than miners, this guidance needs to shift. It’s an issue of relative scarcity (See Figure - Supply and Demand).

Figure - Supply and Demand

That’s not to say that developing new AI tools is a bad thing. However, it is important to keep in mind the current balance between supply and demand for these tools. Some days, business problems will be well known and solutions scarce; under those conditions, the technical aspects become more important. On other days, solutions will abound while problems are scarce. In those situations, softer skills, like being able to work collaboratively with front-office teams, are the key to success.

For financial modeling teams, matching tools to financial problems is one of the core issues of the job. It requires the flexibility to switch between acting as a technical expert and working collaboratively with end users to develop project plans. Along with understanding how to use rapidly evolving AI/ML tools, modeling teams need to identify actual business needs to address. A typical breakdown of time spent in an AI/ML team looks like the following:

  • 10%. Learning about new AI/ML tools
  • 5%. Identifying appropriate AI/ML tools and evaluating vendor solutions
  • 30%. Identifying tasks that need AI/ML support
  • 20%. Getting data to analyze
  • 5%. Implementing the model
  • 10%. Testing the model for reasonability
  • 10%. Writing model documentation

A hammer in search of a nail

Professional AI/ML modelers need to stay up to date with the latest technology. There are many types of AI/ML technologies; each is radically different from the others, serves a different purpose, and evolves constantly. For example, certain AI/ML tools are good at summarizing data like news reports and quarterly earnings reports. Other technologies are good at automatically fitting parameters to time-series studies. Still others are good at grouping data points based on some type of similarity.

Staying up to date is a necessary, and important, part of a modeling job. Reliance on a single technology is self-limiting. The quote “If you only have a hammer, you tend to see every problem as a nail” is attributed to Abraham Maslow, and becoming over-reliant on a single tool really can hinder problem-solving.

Another version of the hammer quote is "To a man with a hammer, everything looks like a nail". This pretty much summarizes the John Henry story (See Figure - Statue of John Henry). John Henry, the “steel-driving man”, is a popular American folk hero. According to the legend, he tried to save railroad laborers' jobs by competing against a steam-powered hammer. He won by driving in more spikes than the steam-powered machine. However, once it was over, he collapsed and died.

Figure - Statue of John Henry

This is a good warning about the dangers of being complacent and not adapting to technological changes. To be able to solve problems, it is important to understand what type of tools are available. Later in this presentation, we are going to look in-depth at two types of technologies:

  1. XGBoost. This is a machine-learning technique that we will be using to automatically fit parameters for a time-series regression.
  2. Chat-GPT.  This is a generative AI tool that we will be using to summarize news reports.

A nail in search of a hammer

The biggest single trap that a modeling group can fall into is building a lot of neat technology while hoping that someone else can suggest a use for it. This is self-limiting – it’s the type of thinking that leads to mass layoffs in a modeling team. The basic issue is that end users (the people with problems) often don’t know that there are tools available to assist them. They’ve already had to figure out some type of workaround, and they don’t really have the time to stay abreast of the constantly changing AI/ML space. This creates a barrier that modeling teams have to help front-office teams cross successfully (See Figure - Oregon Trail, Crossing a River).

Figure - Oregon Trail, Crossing a River

It is necessary to both understand which AI/ML tools are available and help business teams identify issues where those tools would be useful. As the leader of an AI/ML team, defining worthwhile projects is the single biggest challenge of my day. If I have a well-defined problem, a box that needs coloring in, I can always buy a box of crayons and hire a consultant to fill in the blanks. However, if I don’t know what problem to solve, no amount of technology or number of great mathematicians can help me.

Successful AI/ML modeling teams will often spend twice as much time identifying new projects as they do keeping their skills up to date. It’s a huge part of the job. In this presentation, we will develop AI/ML software that helps business groups write a 30-second elevator speech explaining what happened to some line of business, using financial market news stories. There are two main components: a machine-learning factor model and a generative AI summary.

Project Goal: Automatically create a 30-second elevator speech to describe what happened to some line of business based on market news for the day.

Table Stakes at the Big Game

The financial industry is heavily regulated. Any type of AI/ML model needs to meet some basic standards. First, any model will probably need to conform to model risk standards like the Federal Reserve’s SR 11-7. This means it has to be documented and tested. Second, models will need to be ethically developed (can’t steal other people’s data!), must not expose confidential data outside the firm, and will have to be verifiable by humans. These are basically the table stakes needed to get into the big game (See Figure – Table Stakes at the Big Game).

Figure - Casino in Tombstone, Arizona

Some common AI/ML standards include:

  • Ethically developed. Financial AI models can’t use data whose owners didn’t approve that usage. For example, financial firms can’t use AI models based on stolen data or data scraped from public sources without permission.
  • Data is Secure, Private, and Protected. A firm’s private data should not be sent outside the firm. This includes prompts sent to third-party servers like Chat-GPT or Meta-AI.
  • Transparent and explainable. Using Generative AI is like asking a random member of a pool of interns to write a report. You might get the all-star. You might get the dud of the class. Either way, someone responsible will need to make sure any report is reasonable.
  • Tested and documented. Models take a couple of weeks to develop and can stay in use for decades. Anything that can be done to make future maintenance easier is a clear win for a company.

Project Guidelines

We are going to spend the rest of this presentation going through an example of a financial model that uses both machine learning and generative AI technologies. This model will consist of the following:

  1. Factor-Based Model. Create a factor-based model using time-series regression. A machine learning tool, XGBoost, will be used to identify model coefficients.
  2. Explainable Factors. The factors will relate to observable real-world phenomena like changes in interest rates, the stock market, or industry exposure. The factors will allow the model to give meaningful statements like “The portfolio gained $150,000 because the stock portfolio was up 2% today.”
  3. Generative AI provides commentary. Chat-GPT will be used to explain what caused the change. For example, “Strong earnings reports in the health-care industry caused the S&P 500 to rise 2% today”.

The goal of these steps will be to automate the creation of a 30-second elevator speech to describe what happened to some line of business based on market news for the day. The primary constraints are that the analysis is ethically sourced, secure, and verifiable by humans. This is the type of analysis that many business teams would benefit by having on hand every day.

As mentioned previously, identifying good use-cases for AI/ML technologies is one of the longest and most involved portions of any AI/ML project. We are glossing over that work here, since we want an example that shows a couple of different technologies. However, it is worth discussing briefly how an AI/ML project is created.

First, someone needs to understand what various AI/ML tools can be used for. For example, Generative AI models are good at summarizing large amounts of free-form data. This could be summarizing news reports, earnings calls, call center interactions, or even complex legal contracts. Alternatively, certain machine learning models, like XGBoost, are great at automatically fitting parameters for time-series forecasts.

After that, it is necessary to talk to business groups. For example, someone might ask a co-worker, “Can you give me an example of some annoying tasks that you wish could be offloaded to someone else?”. This is a low-stakes query that can give a quick win to make the front-office team happy. The high-stakes projects can wait until everyone is comfortable with using AI/ML for low-stakes projects.

For the example described previously, there are a large number of factor-based models used in finance. These models typically have a form like:

Formula: (Predicted Change in Value) = A (Observed Change in X) + B (Observed Change in Y) + C (Observed Change in Z) + Noise()

The parameters to this type of model are the A, B, and C coefficients that get multiplied by the observed changes in value. These coefficients represent the sensitivity of the predicted value to changes in observed values. Some very common examples of this type of model include bond duration and stock betas. For example, a stock with a beta of 2.0 would be expected to be twice as volatile as some stock index (like the S&P 500). A 4% increase in the stock price would be expected if the index were to rise 2%. Two common examples are listed below, followed by a short code sketch.

  • Bond Duration. Change in Bond Price = A * (Change in Interest Rates)
  • Stock Beta. Change in Stock Price = B * (Change in S&P 500)
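
As a minimal illustration of how these coefficients turn observed market moves into a predicted change in value, the short Python sketch below applies the stock-beta example from above. The sensitivity values and factor names are hypothetical, chosen only for illustration.

# Minimal sketch of a factor-based prediction. The sensitivities below are
# hypothetical illustrations, not fitted values.
factor_sensitivities = {
    "sp500_return": 2.0,       # stock beta: twice as volatile as the index
    "rate_change": -3.0,       # duration-style sensitivity to a 1% rate move
}

observed_moves = {
    "sp500_return": 0.02,      # S&P 500 up 2% today
    "rate_change": 0.00,       # interest rates unchanged
}

# (Predicted Change) = A * (Change in X) + B * (Change in Y) + ...
predicted_change = sum(
    factor_sensitivities[name] * move for name, move in observed_moves.items()
)

print(f"Predicted portfolio return: {predicted_change:.1%}")

With the index up 2% and a beta of 2.0, the sketch reproduces the 4% move described above.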

This type of data is typically fit using a time-series regression model. Some other examples of factor-based models include:

  • Value at Risk (VAR) models. Many market risk teams use factor-based risk models whose factors align to real-world values like interest rates, stock prices, or option volatility. They use these models to predict portfolio returns under various market conditions.
  • Counterparty Credit Reports. Counterparty credit risk teams often create financial statement models for each of a company’s major counterparties. The key element in these models is often the expected growth rate of a counterparty’s earnings.
  • Capital Adequacy (DFAST/CCAR). Both banks and insurance companies need to keep a certain amount of capital in safe investments to cover unexpected cashflows. The office of the CFO will often base their capital adequacy models, like DFAST and CCAR stress testing, on factors that tie back to observable data points.

Whenever the financial markets show unexpected turmoil, these groups are commonly faced with panicked requests from senior executives demanding constant updates on how recent market volatility is affecting the firm’s business. Preparing summaries for senior management takes up a lot of bandwidth for these teams. It’s something that would be well suited for automation using AI/ML tools.

Creating a Factor Model with XGBoost

The first part of our example is to create a factor-based model. We want this model to give us information like, “if interest rates drop 1%, our portfolio will decline in value by 3%”. Combining these details with generative AI, we would then be able to create an after-action report saying something like “Our portfolio lost money today because [Ask Chat-GPT why interest rates fell 1% today].”

Easily accessible implementations of machine learning toolkits, like XGBoost, have made machine-learning available to the masses. Even a decade ago, machine learning tasks, like regression and classification, were limited to strong math/programming teams willing to customize complicated software. One of the most used software languages for data science is Python. It is free to use and has a huge number of data science toolkits. Similar work can be done in other languages including Java, R, S+, SAS, and Matlab, although these might require some additional cost.

In this example, chosen strictly because the dataset is bundled with the statsmodels Python package, the model is trying to predict the world consumption of copper (y) as a function of five other variables (X1 to X5).

  • y.    WORLDCONSUMPTION - World consumption of copper (in 1000 metric tons)
  • X1. COPPERPRICE - Constant dollar adjusted price of copper
  • X2. INCOMEINDEX - An index of real per capita income (base 1970)
  • X3. ALUMPRICE - The price of aluminum
  • X4. INVENTORYINDEX - A measure of annual manufacturer inventory trend
  • X5. TIME - A time trend

The code will (1) import some libraries, (2) load some data, (3) split the data into training and validation sets, (4) run the XGBoost calculation to automatically fit the model, then (5) print out the model’s feature importance scores. As you can see from the code example, it doesn’t take a lot of code these days to implement a machine learning model.

# -----------------------
# Example XGBoost Parameter Fitting

# -----------------------
# 1. Load Libraries
import statsmodels.api as sm
from sklearn.model_selection import train_test_split

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
import xgboost as xgb

warnings.filterwarnings("ignore")

# -----------------------
# 2. Load Data
copper = sm.datasets.copper.load_pandas()
X = copper.exog
y = copper.endog

# -----------------------
# 3. Split the data into training and validation sets
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# -----------------------
# 4. Create regression matrices and fit the model with XGBoost
dtrain_reg = xgb.DMatrix(X_train, y_train, enable_categorical=True)
dtest_reg = xgb.DMatrix(X_test, y_test, enable_categorical=True)

# "hist" runs on the CPU; with a GPU and XGBoost >= 2.0, add "device": "cuda"
params = {"objective": "reg:squarederror", "tree_method": "hist"}
n = 5000
evals = [(dtrain_reg, "train"), (dtest_reg, "validation")]

model = xgb.train(
   params=params,
   dtrain=dtrain_reg,
   num_boost_round=n,
   evals=evals,
   verbose_eval=1,              # print train/validation RMSE each round
   early_stopping_rounds=3,     # stop when validation RMSE stops improving
)

# -----------------------
# 5. Print feature importance scores (how often each feature is used in tree splits)
print(model.get_fscore())

When this analysis was set up, the objective was set to minimize the root mean squared error (RMSE) of the residuals. This is a fairly typical objective for this type of analysis. A residual is the difference between the predicted value of y and the observed value of y. The RMSE squares each residual, averages the squared residuals, and then takes the square root of that average.
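
As a quick illustration of that arithmetic (not part of the XGBoost output, just the RMSE definition), a few lines of NumPy with made-up values:

import numpy as np

# Hypothetical observed and predicted values of y, for illustration only
y_observed = np.array([100.0, 105.0, 98.0, 110.0])
y_predicted = np.array([102.0, 103.0, 99.0, 108.0])

residuals = y_observed - y_predicted        # prediction errors
rmse = np.sqrt(np.mean(residuals ** 2))     # square, average, then square root

print(f"RMSE: {rmse:.2f}")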

When the model runs, it will repeatedly adjust its internal parameters (the boosted trees built on X1 through X5) to minimize the RMSE on the training dataset. At each round, it also calculates the RMSE on the validation dataset. The ultimate goal is to generalize what was learned on the training dataset to make predictions on the validation dataset. Looking at the outputs, the RMSE of the validation dataset stops decreasing after a while. When that happens, further boosting rounds are just overfitting the training set without producing better generalization to the validation dataset, which is why the code uses early stopping (See Figure - XGBoost Model Fitting).

Figure - XGBoost Model Fitting
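
To see exactly where early stopping settled, the fitted Booster records the best round. A small follow-on check, assuming the model object from the training code above:

# Inspect the early-stopping result from xgb.train()
# best_iteration is the boosting round with the lowest validation RMSE;
# best_score is the validation RMSE at that round.
print(f"Best boosting round: {model.best_iteration}")
print(f"Validation RMSE at that round: {model.best_score:.2f}")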

Finally, it’s possible to get the fitted importance scores out of the model (See Figure – Model Outputs). For example, copper price shows up as an important feature, which is consistent with total demand for copper being positively related to copper prices. Alternatively, it might be reasonable to say that both copper demand and copper prices are positively correlated with global economic growth, and that if we see prices rising, this indicates that consumers are buying more copper.


Figure - Model Outputs
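
The importance scores can be pulled out of the fitted model in a couple of ways. A short sketch, again reusing the model object from the training code above:

# get_fscore() counts how often each feature is used in a tree split ("weight").
# get_score(importance_type="gain") ranks features by the average improvement
# in the objective each split produced, which is often more informative.
print("Split counts:", model.get_fscore())
print("Average gain:", model.get_score(importance_type="gain"))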

In a real test, we would make a couple of modifications to the code. To keep the example concise, a couple of important steps were glossed over.

  1. Differencing. For most time-series forecasting, we want to capture a causal relationship (changes in X result in changes in y) rather than a correlation of levels (a high level of X coincides with a high level of y). To do that, we would modify our y and X variables to compare period-over-period changes rather than absolute levels.
  2. Stationarity of Residuals. We would want to verify that the residuals behave consistently across the training and validation sets. This is typically done with stationarity tests like the Augmented Dickey-Fuller (ADF) or KPSS tests. These tests examine things like whether the mean residual stays consistent over time (preferably zero) with a constant variance around that mean. If these tests fail, the model probably can’t be trusted. (A short code sketch of the first two checks follows this list.)
  3. Limit the Number of Variables. This dataset only has 25 time periods. That’s only enough data to support a single variable. As a rule of thumb, with fewer than 30 data points you can only support one explanatory variable for a statistically meaningful result. After that, every 10 additional observations allow you to add another explanatory variable. As a result, we should modify the input data to find the most meaningful explanatory variable (X) and exclude the other explanatory variables until we can acquire more data.
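
A minimal sketch of the differencing and stationarity checks, reusing the y and X data loaded in the XGBoost example above. The residual series here is a placeholder; in practice it would be the observed y minus the model’s predictions.

from statsmodels.tsa.stattools import adfuller, kpss

# 1. Differencing: work with period-over-period changes rather than levels
y_diff = y.diff().dropna()
X_diff = X.diff().dropna()

# 2. Stationarity of residuals: placeholder residuals, for illustration only
residuals = y_diff - y_diff.mean()

adf_stat, adf_pvalue, *_ = adfuller(residuals)
kpss_stat, kpss_pvalue, *_ = kpss(residuals, nlags="auto")

# ADF: a small p-value suggests the series is stationary.
# KPSS: a small p-value suggests the series is NOT stationary.
print(f"ADF p-value:  {adf_pvalue:.3f}")
print(f"KPSS p-value: {kpss_pvalue:.3f}")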

Creating a Summary with Generative AI

Once we have fitted a factor model, like expected global copper demand as a function of copper prices, we could watch the financial markets to check recent copper prices. Using that data, we could predict how much global demand is rising or falling. To finish our report, we might automatically pull in an explanation of why the markets were moving that day. This would allow the firm to automatically generate interesting analytics for decision makers (See Figure – Example of AI Generated Report).


Figure - Example of AI Generated Report

This example assumes you have access to the OpenAI API and have installed the openai Python package.

import openai

# Set your OpenAI API key
openai.api_key = 'your-api-key-here'

# Define a prompt
prompt = "Why did COMEX copper prices decline today?"

# Generate a response using the legacy Completions interface
# (newer versions of the openai package expose a different client interface)
response = openai.Completion.create(
    engine="text-davinci-003",
    prompt=prompt,
    max_tokens=100
)

# Print the generated text
print(response.choices[0].text.strip())        

In a real scenario, we would make a few modifications to the code. To keep the example concise, several important steps were glossed over. First, we would want to evaluate different providers. Different models may work better for certain types of information, and different providers will also have different approaches to ethical development and security.

  1. Ethically developed. If we don’t like how Chat-GPT calibrated their model, we might go with a different vendor like Bloomberg, Factset, or Reuters. These vendors all offer solutions with APIs that would function in a similar way to this example.
  2. Data is Secure, Private, and Protected. Different vendors will have different approaches to security. For financial information, we might want to go with a vendor with more robust secure communication protocols. This would prevent hackers or other market participants from tracking the AI prompts that we are submitting.
  3. Transparent and explainable. We probably want to check answers obtained from multiple large and small language models (LLMs, SLMs). Special-purpose models may be better at interpreting financial information than a model focused on general internet communications. (A short sketch of this cross-check follows the list.)
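
As a minimal sketch of that cross-check, the loop below sends the same prompt to more than one model and prints the answers side by side for a human reviewer. It reuses the legacy Completions interface from the example above, and the model names are placeholders; substitute whatever models your provider exposes.

import openai

openai.api_key = 'your-api-key-here'

# Placeholder model names - substitute the models available from your provider
models_to_compare = ["text-davinci-003", "text-curie-001"]

prompt = "Why did COMEX copper prices decline today?"

# Query each model with the same prompt and print the answers for comparison
for engine in models_to_compare:
    response = openai.Completion.create(engine=engine, prompt=prompt, max_tokens=100)
    print(f"--- {engine} ---")
    print(response.choices[0].text.strip())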

Summary

We have developed an AI/ML application using a combination of machine learning (XGBoost) and Generative AI. The model is explainable the whole way through the process. It also allows for the project constraints to be addressed.

  • Ethically Sourced. The firm should adopt an AI governance framework that checks for ethical sourcing when we go through our vendor selection process. That way, all of the approved vendors will be compliant with our goals.
  • Data Security. The AI governance policy should require an acceptable level of security – both communication protocols and ring fencing our requests from outside parties. For example, if this tool were used to summarize information for a potential merger or acquisition situation, the firm might not want other market participants to find out about that early.
  • Transparency. In this case, we keyed our generative-AI analysis to specific pieces of data obtained from an explainable machine-learning model. As a result, it’s easy to explain why we are making certain forecasts. Our AI-output also contains a link to the original source of the data that users can use to gather more information.

This AI application makes life a little easier for anyone who has to generate these reports. It’s also easy to extend in the future. For example, if we were to run these reports daily, they wouldn’t have to be automatically sent out. It would be easy to send alerts to key people only under unusual market conditions (See Figure – Text Message Alert).

Figure - Text Message Alert
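
As a small sketch of that alerting idea, the function below only surfaces the report when the predicted move crosses a threshold. The threshold and the notification step are placeholders; a real implementation would call an email, SMS, or chat API.

# Placeholder alerting logic: only surface the report on unusual days
ALERT_THRESHOLD = 0.03   # hypothetical cutoff: flag moves larger than 3%

def maybe_send_alert(predicted_change, elevator_speech):
    """Send the daily summary only when the predicted move is unusually large."""
    if abs(predicted_change) >= ALERT_THRESHOLD:
        # In practice this would call a messaging service; here we just print
        print(f"ALERT ({predicted_change:+.1%}): {elevator_speech}")
    else:
        print("No alert today - market move within the normal range.")

# Example usage with hypothetical values
maybe_send_alert(-0.045, "The portfolio fell 4.5% as copper prices declined on weak demand.")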

