AIMultiple Research
We follow ethical norms & our process for objectivity.
AIMultiple's customers in agentic finance include Stack AI.
Agentic Finance
Updated on Aug 27, 2025

AI-Based Stock Trading: Which Gen AI Tool Is Better [2025]

Although LLM tools have been used in AI-based stock trading since their emergence,1 the growing availability of these tools raises the question of which generative AI solution offers the best accuracy in predicting stock markets.

In an experimental setting, I tested several generative AI tools for AI-based stock trading to evaluate their ability to forecast stock price changes based on the provided information. The results show that the ChatGPT 5 Thinking model and the Gemini 2.5 Pro model delivered the best performance.

Performance of AI-powered tools

Updated at 08-27-2025

| Agent | Success rate | Focus in categorization |
|---|---|---|
| GPT 5 Thinking | 74.24% | Leadership concentration & renewal potential |
| Gemini 2.5 Pro | 71.21% | Firm vulnerability & turnaround catalyst |
| GPT 5 Pro | 56.06% | Key‑person risk & turnaround potential |
| GPT 4o | 46.21% | Leadership role, family control, firm size & leverage |
| Claude Sonnet 4 | 46.21% | Succession disruption & governance renewal |
| DeepSeek | 27.27% | Leadership role, financial health & family ownership |
| Gemini 2.5 Flash | 22.73% | Family ownership & leadership position |

For further details on the benchmark, read the stock trading benchmark methodology section.

GPT 5 Thinking model

The Thinking model of ChatGPT 5 presents the highest accuracy among the tested tools, with a 74% success rate. The tool forecasts price change based on two indicators:

Leadership concentration index (LCI) → higher = more likely substantial negative CAR

  • LCI = 0.40·z(role_importance) + 0.30·z(family_control) + 0.20·z(financial_strength) − 0.10·z(size)
    • role_importance: hierarchical weight of deceased (CEO > president > chairman > vice-president).
    • family_control: family ownership (% of voting rights).
    • financial_strength: composite of ROE and ROA (profitability).
    • size: ln(assets).
  • Intuition: Markets expect more disruption when a highly central family member dies at a tightly held, profitable, but relatively smaller firm.
  • Decision: Label as substantially negative if LCI is in the top 30% of the sample and ≥ 0.5 z-units above the renewal potential index (RPI).

Renewal potential index (RPI) → higher = more likely substantial positive CAR

  • RPI = 0.40·(−z(financial_strength)) + 0.25·z(leverage) − 0.20·z(family_control) − 0.10·z(size) + 0.05·z(liquidity_stress)
    • leverage: (long-term + short-term debt) / equity.
    • family_control: family ownership (% of voting rights).
    • liquidity_stress: accounts payable / assets.
    • financial_strength: composite of ROE and ROA.
    • size: ln(assets).
  • Intuition: Firms with weak profitability, some debt pressure, lower family dominance, and liquidity strain may see the market welcome the possibility of governance change or fresh leadership.
  • Decision: Label as substantially positive if RPI is in the top 30% of the sample and ≥ 0.5 z-units above the LCI.
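The two indices and their decision rules can be sketched in a few lines of Python. This is a minimal illustration rather than the model's actual procedure: it assumes z-scores are computed across the sample with the population standard deviation, that "0.5 z-units above" means a difference of at least 0.5 between the two index values, and that anything else is labeled as no significant change. The field names, the numeric `role_importance` encoding, and the two sample firms are hypothetical, and `size` is assumed to already hold ln(assets).

```python
from statistics import mean, pstdev

FEATURES = ("role_importance", "family_control", "financial_strength",
            "size", "leverage", "liquidity_stress")

def zscores(values):
    """Standardize values to mean 0 and population standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

def top_share_cutoff(scores, share):
    """Lowest score that still ranks within the top `share` of the sample."""
    ranked = sorted(scores, reverse=True)
    k = max(1, int(len(ranked) * share))
    return ranked[k - 1]

def label_events(firms, share=0.30, margin=0.5):
    """Label each firm dict with the LCI and RPI rules quoted above."""
    z = {name: zscores([f[name] for f in firms]) for name in FEATURES}
    n = len(firms)
    lci = [0.40 * z["role_importance"][i] + 0.30 * z["family_control"][i]
           + 0.20 * z["financial_strength"][i] - 0.10 * z["size"][i]
           for i in range(n)]
    rpi = [0.40 * -z["financial_strength"][i] + 0.25 * z["leverage"][i]
           - 0.20 * z["family_control"][i] - 0.10 * z["size"][i]
           + 0.05 * z["liquidity_stress"][i]
           for i in range(n)]
    lci_cut = top_share_cutoff(lci, share)
    rpi_cut = top_share_cutoff(rpi, share)
    labels = []
    for l, r in zip(lci, rpi):
        if l >= lci_cut and l - r >= margin:
            labels.append("substantial negative")
        elif r >= rpi_cut and r - l >= margin:
            labels.append("substantial positive")
        else:
            labels.append("no significant change")  # assumed fallback
    return labels

# Two hypothetical firms: a profitable, tightly held small firm losing its CEO,
# and a weak, levered, widely held larger firm losing a vice-president.
firms = [
    {"role_importance": 4, "family_control": 80, "financial_strength": 0.20,
     "size": 5.0, "leverage": 0.1, "liquidity_stress": 0.05},
    {"role_importance": 1, "family_control": 20, "financial_strength": -0.10,
     "size": 8.0, "leverage": 2.0, "liquidity_stress": 0.30},
]
labels = label_events(firms)
print(labels)  # ['substantial negative', 'substantial positive']
```

With only two firms, every z-score is +1 or −1, so the first firm has LCI = 1.0 and RPI = −0.8, and the second has the mirror-image values.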

Gemini 2.5 Pro model

Gemini 2.5 Pro predicts 71% of stock price changes accurately. This model suggests that active traders make decisions based on firm vulnerability and potential opportunity for renewal.

Vulnerability index (VI) → higher = more likely substantial negative CAR

  • VI = 0.40·z(family_control) + 0.35·(−z(financial_strength)) + 0.20·z(leverage) − 0.05·z(size)
    • family_control: Family ownership (% voting rights).
    • financial_strength: Composite z-score of ROE and ROA.
    • leverage: Long-term debt / equity.
    • size: Natural log of total assets (ln(asset)).
  • Intuition: The market punishes the stock when a key leader’s death creates a power vacuum. This risk is highest in highly indebted, unprofitable, and family-dominated firms that lack the resilience and deep management structure of larger corporations. The combination of high family control, poor financial health, and high leverage creates a potent mix for investor uncertainty.
  • Decision: Label Substantially Negative if a firm’s VI score is in the top 5% of the sample.

Turnaround catalyst index (TCI) → higher = more likely substantial positive CAR

  • TCI = 0.50·(−z(financial_strength)) + 0.25·z(family_control) − 0.15·z(leverage) − 0.10·z(size)
    • financial_strength: Composite z-score of ROE and ROA.
    • family_control: Family ownership (% voting rights).
    • leverage: Long-term debt / equity.
    • size: Natural log of total assets (ln(asset)).
  • Intuition: The market reacts positively when death is perceived as an opportunity for renewal. This occurs when an entrenched family leader at an underperforming but financially stable company passes away. The market anticipates that a change in leadership will unlock value by improving strategy and operations, making the firm a potential turnaround story or a takeover target.
  • Decision: Label Substantially Positive if a firm’s TCI score is in the top 5% of the sample and is at least 0.5 z-units above its VI score.
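To make the weighting concrete, the sketch below evaluates VI and TCI for a single firm whose inputs are already standardized, and shows how few events a top-5% cutoff can flag in a sample of 132. The z-values are hypothetical, the negated terms are folded into negative weights (0.35·(−z) becomes a weight of −0.35), and `max_flagged` assumes no ties and rounds down.

```python
# Weights copied from the VI and TCI formulas, with negated terms as negative weights.
VI_WEIGHTS = {"family_control": 0.40, "financial_strength": -0.35,
              "leverage": 0.20, "size": -0.05}
TCI_WEIGHTS = {"financial_strength": -0.50, "family_control": 0.25,
               "leverage": -0.15, "size": -0.10}

def composite(z, weights):
    """Weighted sum of standardized inputs."""
    return sum(weight * z[name] for name, weight in weights.items())

def max_flagged(sample_size, share=0.05):
    """Upper bound on events inside the top `share` (assumes no ties, rounds down)."""
    return int(sample_size * share)

# Hypothetical z-scores: family-dominated, unprofitable, highly levered, smallish firm.
z_firm = {"family_control": 1.5, "financial_strength": -1.0,
          "leverage": 2.0, "size": -0.5}

vi = composite(z_firm, VI_WEIGHTS)    # 0.60 + 0.35 + 0.40 + 0.025 = 1.375
tci = composite(z_firm, TCI_WEIGHTS)  # 0.50 + 0.375 - 0.30 + 0.05 = 0.625
passes_margin = tci - vi >= 0.5       # False: TCI is below VI for this firm

print(round(vi, 3), round(tci, 3), passes_margin)
print(max_flagged(132))  # 6
```

For this firm, high leverage raises VI but lowers TCI, so the turnaround rule's 0.5 z-unit margin is not met even though both scores are positive.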

GPT 5 Pro model

GPT 5 Pro achieves a 56% accuracy rate in my benchmark. The GenAI tool makes predictions based on two indicators:

Key‑person risk index (KPRI) → higher = more likely substantial negative CAR

  • KPRI = 0.40·z(ownership) + 0.30·z(leverage) − 0.20·z(size) + 0.10·z(profitability)
  • Intuition: Tightly held, more levered, smaller, and currently profitable firms face higher perceived key‑person risk at a family member’s death.
  • Decision: Label as substantially negative if KPRI is in the top 30% of the sample and ≥ 0.5 z‑units above TPI.

Turnaround potential index (TPI) → higher = more likely substantial positive CAR

  • TPI = 0.40·(−z(profitability)) + 0.20·z(leverage) − 0.20·z(ownership) − 0.10·z(size) + 0.10·z(AP/assets)
  • Intuition: Weak performance and some financial pressure, combined with lower family control, can make markets welcome a leadership change.
  • Decision: Label as substantially positive if TPI is in the top 30% of the sample and ≥ 0.5 z‑units above KPRI.
[Figure: GPT 5 Pro's predicted 3-day CAR categories in the AI-based stock trading benchmark]

GPT 4o

This older ChatGPT model bases its predictions on the role of the deceased in the firm, the family’s ownership, firm size, and financial leverage. The model predicts an event’s CAR as:

Substantial negative, if

  • The deceased is CEO/Chairman
  • High family ownership (>70%) & the deceased held a leadership role
  • Smaller or less diversified firms (low assets/revenue)
  • High leverage: long-term debt or short-term debt > assets

Substantial positive, if

  • Deceased had a minor role (e.g., board member or VP)
  • The company was underperforming (e.g., negative ROE or ROA), so markets may view a leadership change as positive
  • Low family control (<30%)

No significant change, if

  • Medium to large firms with strong financials
  • Deceased not in active leadership
  • Low to medium family ownership (30%–60%)
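The rule lists above do not say how the conditions combine, so the following is only one possible reading: count how many negative and positive conditions an event meets and pick the side with more matches, falling back to no significant change on a tie. The field names, the set of leadership roles, and the `is_small_firm` flag are hypothetical, because the rules give no numeric size cutoff.

```python
LEADERSHIP_ROLES = {"CEO", "Chairman", "President"}  # assumption: what counts as leadership
MINOR_ROLES = {"Board member", "VP"}

def classify_event(event):
    """Count matching conditions per class; ties fall back to no significant change."""
    negative_signals = sum([
        event["role"] in {"CEO", "Chairman"},
        event["family_pct"] > 70 and event["role"] in LEADERSHIP_ROLES,
        event["is_small_firm"],
        max(event["long_term_debt"], event["short_term_debt"]) > event["assets"],
    ])
    positive_signals = sum([
        event["role"] in MINOR_ROLES,
        event["roe"] < 0 or event["roa"] < 0,
        event["family_pct"] < 30,
    ])
    if negative_signals > positive_signals:
        return "Substantial negative"
    if positive_signals > negative_signals:
        return "Substantial positive"
    return "No significant change"

# Hypothetical event: a CEO dies at a large, profitable firm with 80% family ownership.
ceo_event = {"role": "CEO", "family_pct": 80, "is_small_firm": False,
             "long_term_debt": 200, "short_term_debt": 50, "assets": 1000,
             "roe": 0.12, "roa": 0.06}
print(classify_event(ceo_event))  # Substantial negative
```

The "No significant change" conditions are not checked explicitly here; they act as the default when neither side dominates.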

Claude Sonnet 4

Claude Sonnet 4 achieves a 46% accuracy rate in predicting stock price movements following family leadership deaths. This model employs a multi-factor scoring system that weighs leadership succession risk against firm resilience factors.

Succession disruption score (SDS) → higher = more likely substantial negative CAR

  • SDS = 0.30·z(position_weight) + 0.25·z(family_ownership) + 0.20·(−z(financial_performance)) + 0.15·z(debt_burden) − 0.10·z(firm_scale)
    • position_weight: Hierarchical scoring where CEO = 3, Chairman = 2.5, President = 2, VP = 1, Board = 0.5
    • family_ownership: Family voting control percentage
    • financial_performance: Composite score of ROE and ROA z-scores
    • debt_burden: Long-term debt to assets ratio
    • firm_scale: Employee count as proxy for organizational depth
  • Intuition: Markets react most negatively when a critical leadership void combines with concentrated family control and weak institutional resilience. The death of a CEO or Chairman in a family-dominated firm creates an immediate succession crisis, particularly when the firm lacks the financial strength to weather uncertainty or the organizational depth to ensure continuity. High debt amplifies this vulnerability by constraining strategic flexibility during the transition period.
  • Decision: Label Substantially Negative if SDS is in the top 36% of the sample (score ≤ -3.0).

Governance renewal index (GRI) → higher = more likely substantial positive CAR

  • GRI = 0.35·(−z(financial_performance)) + 0.25·z(institutional_quality) − 0.20·z(family_ownership) + 0.15·z(market_development) − 0.05·z(position_weight)
    • financial_performance: Composite ROE/ROA weakness score
    • institutional_quality: Firm size and sector stability indicators
    • family_ownership: Family control concentration (inverted)
    • market_development: Country-based market efficiency proxy
    • position_weight: Leadership position importance (inverted)
  • Intuition: Markets anticipate value creation when leadership change occurs in underperforming firms with dispersed ownership structures. The death removes potential entrenchment effects while preserving institutional capabilities needed for turnaround. This is especially pronounced in developed markets where professional management succession is more readily available and governance mechanisms are stronger.
  • Decision: Label Substantially Positive if GRI score is in the top 17% of the sample (score ≥ 1.5) and exceeds SDS by at least 2.0 points.
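Unlike the other models, this one spells out a numeric score for each position, which makes the raw inputs easy to reproduce before standardization. The sketch below builds the unstandardized SDS inputs that can be derived directly from the definitions above; the function name and argument names are hypothetical, and roles outside the quoted list are deliberately left unhandled because the description does not cover them.

```python
# Position scores as quoted in the SDS definition.
POSITION_WEIGHT = {"CEO": 3.0, "Chairman": 2.5, "President": 2.0,
                   "VP": 1.0, "Board": 0.5}

def sds_raw_inputs(role, family_voting_pct, long_term_debt, assets, employees):
    """Raw (not yet z-scored) SDS inputs for one event."""
    return {
        "position_weight": POSITION_WEIGHT[role],  # KeyError for unlisted roles
        "family_ownership": family_voting_pct,
        "debt_burden": long_term_debt / assets,    # long-term debt to assets
        "firm_scale": employees,                   # proxy for organizational depth
    }

# Hypothetical event: a chairman dies at a firm with 55% family voting control.
example = sds_raw_inputs("Chairman", 55.0, 200.0, 1000.0, 12000)
print(example)
```

Each of these values would then be z-scored across the sample before the weights are applied; the financial_performance composite is omitted because its construction is not detailed beyond "ROE and ROA z-scores".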

DeepSeek

This generative AI tool describes its approach as expert heuristic analysis and reports an estimated accuracy rate of ~65% on standard financial event-study benchmarks, well above the 27% it achieved in my benchmark. Its decisions weigh three primary factors:

Role of the deceased

  • Chairman/CEO/President: Immediately flagged for high potential negative impact.
  • Emeritus/Honorary/VP: Weighted significantly less, often leading to a “No Significant Change” prediction.
  • Missing (#N/A) or Minor Roles: Treated as a weak signal.

Financial health

  • Net income and ROE/ROA: A negative net income or very low returns paired with a key role often pushes the prediction toward “Negative.” If the role is minor, the same weakness can suggest “Positive.”
  • Long-term/Short-term debt: High debt levels amplify the perceived risk when the deceased held a key role.

Family ownership

  • High Ownership (>60%) + Key Role: Strongly amplifies the “Negative” prediction (entrenched leadership, succession uncertainty).
  • Low Ownership (<30%) + Poor Performance: Amplifies the “Positive” prediction (easier for outsiders to force change).

Gemini 2.5 Flash model

Gemini 2.5 Flash states that its predictions are based on the event study and corporate governance literature. It achieves a 23% accuracy rate, the lowest in the benchmark. The model labels event CARs based on these assumptions:

  • Substantial Negative: The family ownership percentage is relatively high (>30%) and the deceased held a critical position (CEO or Chairman).
  • No Significant Change: The family ownership percentage is relatively low (<30%) or the firm is large and appears to have a strong corporate structure.
  • Substantial Positive: This scenario typically occurs when the manager’s performance is poor or when they are believed to be an obstacle to the company’s future. There is not enough information in the provided data to make such a prediction. Therefore, all predictions were labeled as either “Substantial Negative” or “No Significant Change.”
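These assumptions reduce to a two-outcome rule, sketched below. Treating every event that does not meet the negative condition as "No Significant Change" is my reading of the description (it leaves firms with exactly 30% ownership unspecified), and the function and argument names are hypothetical.

```python
def gemini_flash_style_label(family_pct, role):
    """Two-outcome rule: negative for critical roles at firms with high family ownership."""
    if family_pct > 30 and role in {"CEO", "Chairman"}:
        return "Substantial Negative"
    return "No Significant Change"

# Hypothetical events covering each branch.
events = [(65, "CEO"), (65, "VP"), (20, "Chairman"), (30, "CEO")]
predictions = [gemini_flash_style_label(pct, role) for pct, role in events]
print(predictions)

# The rule can never emit "Substantial Positive".
print("Substantial Positive" in predictions)  # False
```

Because the rule never predicts a positive reaction, every event whose actual CAR was significantly positive counts as a miss.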

AI-based stock trading benchmark methodology

Prompting

The benchmark evaluates whether generative AI tools can predict stock market reactions to an unexpected event, based on given company fundamentals. The setup relies on data from Tanyeri & Alp (2023).2

Each AI tool receives a snapshot of firm-level information:

Financial information

  • Asset size
  • Equity size
  • Earnings before interest, tax, depreciation, and amortization (EBITDA)
  • Net income
  • Yearly revenue
  • Long- and short-term debt
  • Accounts payable
  • Return on equity (ROE)
  • Return on assets (ROA)

Other information

  • Family ownership stake
  • Country of headquarters and stock listing
  • Number of employees
  • Industry/sector

No firm name or other identifiers are provided.

Main question

Given the information above, each AI solution is asked to predict whether the 3-day cumulative abnormal return (CAR) for each of the 132 events will be:

  • Significantly positive
  • Significantly negative
  • Not significant

CAR measures how financial markets respond to the event. A positive CAR indicates that stock traders perceive the event as value-enhancing, a negative CAR as value-reducing, and an insignificant CAR as neutral.
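As a rough illustration of what a 3-day CAR is, the sketch below sums daily abnormal returns over the event window. It uses market-adjusted abnormal returns (stock return minus a benchmark index return), which is a simplifying assumption; the expected-return model behind the benchmark's CAR values is not described in this article, and the return figures are made up.

```python
def cumulative_abnormal_return(stock_returns, benchmark_returns):
    """Sum of daily abnormal returns, taken here as stock minus benchmark return."""
    if len(stock_returns) != len(benchmark_returns):
        raise ValueError("return series must cover the same days")
    return sum(s - b for s, b in zip(stock_returns, benchmark_returns))

# Hypothetical daily returns over a 3-day window around the event.
stock = [-0.021, -0.008, 0.004]
market = [0.002, -0.001, 0.003]

car = cumulative_abnormal_return(stock, market)
print(f"3-day CAR: {car:.3%}")  # 3-day CAR: -2.900%
```

The sign alone does not settle the category: whether a CAR counts as significantly positive or negative depends on a statistical test that is not shown here.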

Sampling

The dataset includes 132 death events in 109 publicly traded family firms across 24 countries. All firms are ranked among the 500 largest family firms.

Performance measurement

The benchmark builds on prior technical analysis of stock prices. For each firm, the 3-day CAR has been calculated and categorized as:

  • Significantly positive
  • Significantly negative
  • Not significant

The AI predictions are compared with historical CAR values. Accuracy is measured as the percentage of correct predictions made by each generative AI solution.
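The accuracy measure is a simple share of matching labels. In the sketch below, the label strings are placeholders, and the count of 98 correct predictions is inferred from the reported 74.24% over 132 events rather than taken from the article.

```python
def accuracy_pct(predicted, actual):
    """Percentage of events where the predicted category matches the actual one."""
    if len(predicted) != len(actual):
        raise ValueError("need one prediction per event")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100 * correct / len(actual)

# Placeholder data: 132 events, 98 of which are predicted correctly.
actual = ["negative"] * 132
predicted = ["negative"] * 98 + ["positive"] * 34

print(round(accuracy_pct(predicted, actual), 2))  # 74.24
```

The other reported rates are consistent with whole-number counts out of 132 as well, for example 94 correct for 71.21% and 30 correct for 22.73%.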

FAQs

Is AI good for stock trading?

AI can be useful in stock trading because it can analyze vast amounts of market data, historical data, and real-time insights faster than humans. AI-powered trading bots use trading algorithms, technical indicators, and fundamental analysis to spot market trends, generate trading signals, and execute trades. They can support stock traders with trade ideas, portfolio analysis, and risk management across multiple asset classes.
While AI stock pickers and AI-powered tools may help identify patterns and reduce emotional bias, stock trading still carries risks. Active traders should combine AI capabilities with their own research, strategy development, and awareness of market conditions to make better-informed decisions.

Can AI make me money on the stock market?

[Figure: profitability of AI-based stock trading compared with other active ETFs]

AI can help in stock trading by analyzing market data, historical data, and real-time data faster than humans. AI trading bots use trading algorithms, technical analysis, and fundamental analysis to generate trading signals and execute trades. They can spot market trends, react quickly to news, and provide trade ideas. For example, AI trading bots can react to news releases or Fed minutes within seconds, something no human trader can match.3 However, AI-based stock trading also comes with risks, especially during market volatility, when stock trading bots may trigger herd-like selling. AI-powered tools can offer valuable insights, but making informed decisions still requires your own research, risk management, and awareness of market conditions.

Ezgi is an Industry Analyst at AIMultiple, specializing in sustainability, survey and sentiment analysis for user insights, as well as firewall management and procurement technologies.
