Translating Natural Language to API Calls

One of the standout hurdles was translating natural-language queries into API calls that the Next Gen Stats database could handle. Users’ conversational phrasing, acronyms, and typos added complexity. The solution combined traditional NLP techniques with large language models (LLMs) to identify key entities such as players and actions. For example, searching “Mahomes touchdowns” would correctly parse “Mahomes” as a player and “touchdowns” as the action.
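A minimal sketch of that entity-extraction step might look like the following. The lookup tables here are hypothetical stand-ins; a production system would load players and stat categories from the Next Gen Stats database, and an LLM would handle phrasings this rule-based pass misses.

```python
import re

# Hypothetical lookup tables for illustration only; the real system draws
# entities from the Next Gen Stats database rather than hard-coding them.
KNOWN_PLAYERS = {"mahomes": "Patrick Mahomes", "brady": "Tom Brady"}
KNOWN_ACTIONS = {"touchdowns", "interceptions", "sacks"}

def parse_query(query: str) -> dict:
    """Split a free-text query into a player entity and an action entity."""
    tokens = re.findall(r"[a-z]+", query.lower())
    result = {"player": None, "action": None}
    for token in tokens:
        if token in KNOWN_PLAYERS:
            result["player"] = KNOWN_PLAYERS[token]
        elif token in KNOWN_ACTIONS:
            result["action"] = token
    return result

parse_query("Mahomes touchdowns")
# {'player': 'Patrick Mahomes', 'action': 'touchdowns'}
```

In practice this fast dictionary pass would run first, with the LLM invoked only when it fails, which keeps latency and cost down for common queries.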

Handling Complex Queries with Multiple Constraints

Interpreting queries with layered constraints, such as “Tom Brady touchdowns on third down in the fourth quarter,” presented another challenge: each constraint (player, play type, down, quarter) had to be recognized and mapped to its own database filter.
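One way to sketch that decomposition is a pass that pulls each situational constraint out of the text independently. The phrase tables below are illustrative assumptions, not the NFL's actual grammar; real coverage would be far broader.

```python
# Illustrative phrase tables; a production parser would cover many more
# phrasings ("3rd down", "Q4", "final quarter", ...).
DOWN_PHRASES = {"first down": 1, "second down": 2, "third down": 3, "fourth down": 4}
QUARTER_PHRASES = {"first quarter": 1, "second quarter": 2,
                   "third quarter": 3, "fourth quarter": 4}

def extract_constraints(query: str) -> dict:
    """Map situational phrases in a query to structured API filters."""
    q = query.lower()
    constraints = {}
    for phrase, down in DOWN_PHRASES.items():
        if phrase in q:
            constraints["down"] = down
    for phrase, quarter in QUARTER_PHRASES.items():
        if phrase in q:
            constraints["quarter"] = quarter
    return constraints

extract_constraints("Tom Brady touchdowns on third down in the fourth quarter")
# {'down': 3, 'quarter': 4}
```

Each extracted key then becomes one filter parameter on the eventual API call, so constraints compose instead of requiring a handcrafted rule per combination.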

Efficient Entity Recognition with Potential Ambiguities

Ambiguities in identifying players and teams were another obstacle, especially with typos, acronyms, or duplicate names. The system combined traditional NLP and LLMs to resolve them, and when duplicate names arose, it prompted users for clarification to ensure accuracy.
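The typo-and-duplicate handling can be sketched with fuzzy string matching: return every close roster match, and treat more than one result as the signal to ask the user which player they meant. The roster entries here are hypothetical, and `difflib` stands in for whatever matcher the real system uses.

```python
import difflib

# Hypothetical roster; duplicate names (two Josh Allens) are the
# ambiguous case the system must surface to the user.
ROSTER = [
    "Josh Allen (BUF, QB)",
    "Josh Allen (JAX, LB)",
    "Patrick Mahomes (KC, QB)",
]

def resolve_player(name: str) -> list:
    """Return candidate players; len > 1 means we prompt for clarification."""
    return difflib.get_close_matches(name, ROSTER, n=5, cutoff=0.3)

# A typo like "Mahomez" still resolves to a single candidate,
# while "Josh Allen" returns two and triggers a clarification prompt.
```

Returning the full candidate list, rather than silently picking the best score, is what lets the system ask "Did you mean the Bills QB or the Jaguars LB?" instead of guessing wrong.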

Evaluating Performance Without Ground Truth Data

Without initial benchmarks, the team struggled to measure accuracy. They partnered with NFL experts to create test questions across difficulty levels, yielding both quantitative accuracy metrics and qualitative insight into the model’s strengths and weaknesses.
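A simple harness over such an expert-written test set might score accuracy per difficulty bucket. Everything below is a hypothetical sketch: the questions, expected answers, and the `answer_fn` interface are assumptions, not the NFL's actual evaluation suite.

```python
# Hypothetical expert-written test set; values are placeholders,
# not real NFL statistics.
TEST_SET = [
    {"question": "Mahomes touchdowns in 2023", "expected": 27, "difficulty": "easy"},
    {"question": "Brady third-down TDs in Q4", "expected": 11, "difficulty": "hard"},
]

def evaluate(answer_fn, test_set):
    """Score a question-answering function, broken down by difficulty."""
    buckets = {}
    for case in test_set:
        stats = buckets.setdefault(case["difficulty"], {"correct": 0, "total": 0})
        stats["total"] += 1
        if answer_fn(case["question"]) == case["expected"]:
            stats["correct"] += 1
    return {d: s["correct"] / s["total"] for d, s in buckets.items()}
```

Breaking accuracy down by difficulty is what surfaces the qualitative picture the text mentions: a model can look strong on easy lookups while failing the multi-constraint questions.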

Enabling Continuous Learning and Improvement

Building a system that evolves was a critical goal. The NFL implemented feedback loops and semantic search powered by a vector database, letting the system learn from previously answered queries.

This iterative learning framework has made the system faster, smarter, and more cost-effective over time.
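The cost-and-speed benefit of semantic search can be sketched as a cache that reuses an earlier answer when a new query embeds close enough to a stored one. The bag-of-letters embedding and the 0.95 threshold below are toy assumptions; a real deployment would use a learned embedding model and a vector database.

```python
import math

def embed(text: str) -> list:
    """Toy 26-dim letter-count embedding; a real system uses a learned model."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Reuse a past answer when a new query is semantically close enough."""

    def __init__(self, threshold: float = 0.95):
        self.entries = []  # (embedding, answer) pairs
        self.threshold = threshold

    def store(self, query: str, answer) -> None:
        self.entries.append((embed(query), answer))

    def lookup(self, query: str):
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best is not None and cosine(q, best[0]) >= self.threshold:
            return best[1]
        return None  # cache miss: fall through to the full NLP pipeline
```

Every cache hit skips the expensive LLM-and-database path entirely, which is one plausible mechanism behind the "faster and more cost-effective over time" claim.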