I built an LLM-powered tool for competitive exam explanations and decided to low-key test the "solutions" part on one of the JEE Mains 2025 papers (India's most competitive engineering entrance exam, with ~1.2M students).
Raw results:
- 75 total questions
- 67 correct answers
- 6 questions couldn't be processed (required diagram input - not supported yet)
- 2 incorrect
- 97% accuracy on processable questions, 89% overall
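For anyone who wants to check the two percentages, here is a quick sketch of the arithmetic. The variable names are mine; the counts are the ones listed above.

```python
# Counts taken from the raw results above.
total = 75
correct = 67
unprocessed = 6   # needed diagram input
incorrect = 2

# Sanity check: the three buckets should add up to the full paper.
assert correct + unprocessed + incorrect == total

processable = total - unprocessed            # 69 questions the tool could attempt
acc_processable = correct / processable      # 67 / 69
acc_overall = correct / total                # 67 / 75

print(f"processable: {acc_processable:.1%}")  # 97.1%
print(f"overall:     {acc_overall:.1%}")      # 89.3%
```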
The JEE covers advanced physics, chemistry, and mathematics at a level that traditionally requires years of intensive preparation.
The two failures were revealing:
Physics optics problem: The LLM made a sign error when differentiating the mirror equation to get the image acceleration. My extensive formatting rules may also have contributed to this, which I want to look into further.
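To show where a sign can slip, here is a generic sketch of that differentiation. It assumes the mirror equation in the form 1/v + 1/u = 1/f with fixed focal length f, where u and v are the object and image distances as functions of time. The actual question's sign convention and numbers are not reproduced here, so treat this as illustrative only.

```latex
\begin{align*}
\frac{1}{v} + \frac{1}{u} &= \frac{1}{f} \\
-\frac{\dot v}{v^{2}} - \frac{\dot u}{u^{2}} &= 0
\quad\Longrightarrow\quad
\dot v = -\frac{v^{2}}{u^{2}}\,\dot u \\
\ddot v &= -\frac{v^{2}}{u^{2}}\,\ddot u
          - \frac{2 v\,\dot v\,\dot u}{u^{2}}
          + \frac{2 v^{2}\,\dot u^{2}}{u^{3}}
\end{align*}
```

The second derivative has three terms with mixed signs, and the middle one flips sign again once the expression for the image velocity is substituted in. That makes it an easy place to drop a minus.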
Chemical kinetics problem: The LLM failed on a numerical simplification step. The official solution uses a neat trick of replacing e^-23.031 with e^(-10 × ln 10) = 10^-10 to make the arithmetic manageable. The LLM computed the raw exponential instead and accumulated rounding errors.
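A small sketch of why the trick works, using only the exponent quoted above (the rest of the problem is not reproduced here, and the variable names are mine). Since 10 × ln 10 ≈ 23.026, the exponential is very nearly 10^-10:

```python
import math

# Exponent quoted in the post; the surrounding problem is not shown here.
exponent = -23.031

# Direct evaluation, roughly what the LLM attempted by hand.
direct = math.exp(exponent)

# The shortcut: 10 * ln(10) is about 23.026, so e^(-23.031) is close to
# e^(-10 * ln 10), which is exactly 10^(-10).
ten_ln10 = 10 * math.log(10)
shortcut = 10.0 ** -10

# Relative difference between the direct value and the shortcut.
relative_gap = abs(direct - shortcut) / shortcut

print(f"10 * ln(10)  = {ten_ln10:.4f}")
print(f"direct       = {direct:.4e}")
print(f"shortcut     = {shortcut:.4e}")
print(f"relative gap = {relative_gap:.2%}")
```

The two values agree to within about 0.5%. The shortcut also leaves a clean power of ten to carry through the remaining steps, instead of a long decimal that has to be rounded at each stage.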
Both were numerical answer questions (no multiple choice options to guide toward the right approach).
I think it's too early to say anything about reliability, but I find the results very interesting.
I'll be working on more JEE papers soon and will report back with cumulative stats over more questions.