According to reports from Morningstar, University of Florida finance professor Alejandro Lopez-Lira has been testing the stock-picking abilities of AI models such as ChatGPT, DeepSeek, and Grok. He has found impressive results, which suggest these technologies could automate many tasks currently performed by financial analysts.
Danelfin's AI-powered stock selection strategy generated a return of +263% from January 2017 to August 2024, outperforming the S&P 500's +189% over the same period.[1][2] The platform's AI Score system shows that US-listed stocks with the highest scores (10/10) outperformed the market by an average of 14.69% (annualized alpha) over three months, while those with the lowest scores (1/10) underperformed it by 37.38%.[1][2]
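Because the two headline figures are cumulative returns over several years, it can help to put them on a per-year footing. The sketch below converts each figure to a compound annual growth rate. It assumes that "+263%" and "+189%" are total returns over the whole window and that the window is exactly 7 years and 8 months; neither the precise start and end dates nor the compounding convention is given in the source, and the function name is illustrative.

```python
def annualized_return(cumulative_return: float, years: float) -> float:
    """Convert a cumulative return (2.63 means +263%) to a compound annual rate."""
    return (1.0 + cumulative_return) ** (1.0 / years) - 1.0


# Assumption: January 2017 through August 2024 is treated as 7 years 8 months.
YEARS = 7 + 8 / 12

strategy_cagr = annualized_return(2.63, YEARS)  # Danelfin strategy, per the text
sp500_cagr = annualized_return(1.89, YEARS)     # S&P 500, per the text

print(f"Strategy: {strategy_cagr:.1%} per year")
print(f"S&P 500:  {sp500_cagr:.1%} per year")
print(f"Gap:      {strategy_cagr - sp500_cagr:.1%} per year")
```

Under these assumptions the gap works out to a few percentage points a year, roughly 18% versus 15%, which is a more modest-looking difference than the cumulative figures suggest.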
Real-world tests of AI stock pickers have yielded mixed but promising results. In one experiment, two AI-selected stocks showed an average return of 10.74% after 30 trading days, with one stock outperforming the S&P 500 by nearly fivefold.[3] Another AI platform, AltIndex, claims its stock picks historically achieve a 70% win rate with average gains of 22% over six months.[4] However, as with all investment strategies, these platforms typically include disclaimers that past performance doesn't guarantee future results, acknowledging the inherent unpredictability of financial markets despite advanced AI analysis.
Lopez-Lira's approach to testing AI stock-picking capabilities involves a rigorous methodology that has evolved over time. Initially, he conducted a simple experiment to determine whether ChatGPT could accurately judge if news headlines were positive or negative for stocks, which yielded a remarkable 512% return.[1] For real-money applications through the Autopilot investment app, he developed a more sophisticated process in which AI models score companies on a scale of 1 to 100 based on comprehensive data, including macroeconomic conditions, geopolitical risks, and company financials.[2]
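To make the headline experiment more concrete, here is a minimal sketch of how positive or negative headline labels could be turned into a one-day long-short return. It is an illustration under stated assumptions, not the procedure used in the research: the labels are taken as already produced by a model, and the tickers, returns, label names, and equal weighting are all hypothetical.

```python
from collections import defaultdict

# Assumption: the model's answer per headline has been reduced to one of three labels.
LABEL_TO_SCORE = {"positive": 1, "negative": -1, "neutral": 0}


def ticker_signals(labeled_headlines):
    """Average the headline scores for each ticker."""
    scores = defaultdict(list)
    for ticker, label in labeled_headlines:
        scores[ticker].append(LABEL_TO_SCORE[label])
    return {ticker: sum(vals) / len(vals) for ticker, vals in scores.items()}


def long_short_return(signals, next_day_returns):
    """Equal-weight return: long positive-signal tickers, short negative ones."""
    legs = []
    for ticker, signal in signals.items():
        if signal > 0:
            legs.append(next_day_returns[ticker])
        elif signal < 0:
            legs.append(-next_day_returns[ticker])
    return sum(legs) / len(legs) if legs else 0.0


# Hypothetical data for illustration only.
headlines = [("AAA", "positive"), ("AAA", "neutral"),
             ("BBB", "negative"), ("CCC", "neutral")]
returns = {"AAA": 0.02, "BBB": -0.01, "CCC": 0.005}

signals = ticker_signals(headlines)
day_return = long_short_return(signals, returns)
print(signals, day_return)
```

A ticker with a net-neutral signal is simply left out of the portfolio, so only the long and short legs contribute to the day's return.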
The methodology has become increasingly sophisticated as Lopez-Lira removes restrictions on the AI models. He now uses OpenAI's o3, xAI's Grok 3, and DeepSeek R1 to create portfolios of 15 positions, allowing them to determine their own weightings and asset class combinations.[2] In collaborative research with colleagues from the Federal Reserve and University of Cologne, he employed machine learning to test 200 investing theories and discovered that a specific ratio involving sales from acquisitions and rental expenses outperformed traditional metrics like the book-to-market variable, generating monthly returns of 1.03% after 2012 compared to less than 0.1% for the traditional approach.[1]
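The step from company scores on a 1 to 100 scale to a 15-position portfolio can be pictured with a simple rule-based sketch. Note that in the setup described above the models choose their own weightings; the score-proportional weighting below is a stand-in assumption used only for illustration, and the tickers and scores are made up.

```python
def build_portfolio(scores, n_positions=15):
    """Keep the n highest-scoring tickers and weight them in proportion to score."""
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n_positions]
    total = sum(score for _, score in top)
    return {ticker: score / total for ticker, score in top}


# Hypothetical scores for 20 tickers (T00 scores 40, T19 scores 97).
scores = {f"T{i:02d}": 40 + 3 * i for i in range(20)}

portfolio = build_portfolio(scores)
for ticker, weight in portfolio.items():
    print(f"{ticker}: {weight:.2%}")
```

Proportional weighting tilts the portfolio toward the highest-scoring names while keeping every selected position; an equal-weight rule would be an equally plausible baseline.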
When comparing ChatGPT and DeepSeek for trading applications, each platform demonstrates distinct strengths. ChatGPT excels with complex trading instructions and has proven more effective at capturing economic news that is linked to the market risk premium.[1] In direct trading strategy tests, ChatGPT performed better on complex indicator challenges, successfully generating a profitable strategy with 514 trades and a 33% win rate.[2]
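A 33% win rate and a profitable strategy are not in conflict: what matters is how large the average win is relative to the average loss. The short sketch below works out the break-even ratio for a given win rate. Only the 33% figure comes from the text; the average win and loss sizes are hypothetical, since the source does not report them.

```python
def breakeven_payoff_ratio(win_rate: float) -> float:
    """Average win / average loss needed for zero expected profit per trade."""
    return (1.0 - win_rate) / win_rate


def expectancy(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Expected profit per trade; avg_loss is given as a positive number."""
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


WIN_RATE = 0.33  # from the text

ratio = breakeven_payoff_ratio(WIN_RATE)
print(f"Break-even win/loss ratio: {ratio:.2f}")

# Hypothetical payoff sizes, in units of the average loss.
print(expectancy(WIN_RATE, avg_win=2.5, avg_loss=1.0))  # above break-even
print(expectancy(WIN_RATE, avg_win=1.5, avg_loss=1.0))  # below break-even
```

At a 33% win rate the average winner has to be a little more than twice the size of the average loser before the strategy makes money, which is typical of trend-following style systems.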
DeepSeek, while underperforming ChatGPT in stock market prediction tasks (likely due to ChatGPT's more extensive English language training),[1] shows superior capabilities in simple trading strategy development and complex mathematical calculations.[2][3] The performance gap appears in practical applications as well: information identified by DeepSeek tends to be incorporated into stock prices immediately and has no predictive power for future returns, while positive news identified by ChatGPT correlates with both current and subsequent market returns for up to six months.[1] For traders choosing between these platforms, ChatGPT offers better market contextualization and fundamental analysis, while DeepSeek provides stronger coding capabilities and efficient data processing for technical analysis.[3]