AI Use Case: Demographic Data Analysis
- Kevin D
- Apr 22
Our schools have a large pool of demographic data that could be used to look at trends in academic performance or enrollment. In this case, we are going to take anonymized student data and just see what ChatGPT notices.
Step 1: The Data
I took an old poolbook document - listing student name, address, demographics, and test scores - that we used to figure out Title Allotments with each of our districts. I deleted names, birthdates, and addresses.
Remember - what you feed in may be used as training data or may become publicly available - unless your LLM follows the data guidelines you are required to meet.
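If your poolbook can be exported as a CSV, the deletion step can also be scripted so nothing gets missed by hand. This is only a minimal sketch: the column names (`Name`, `Birthdate`, `Address`, and the rest) and the sample row are made up for illustration, so match them to whatever your own export uses.

```python
import csv
import io

# Hypothetical column names; adjust to match your own poolbook export.
IDENTIFYING_COLUMNS = {"name", "birthdate", "address"}

def strip_identifiers(csv_text, drop=IDENTIFYING_COLUMNS):
    """Return CSV text with the identifying columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep every column whose lowercased header is not in the drop set.
    kept = [col for col in reader.fieldnames if col.lower() not in drop]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # dropped columns are ignored on write
    return out.getvalue()

# Made-up sample row, not real student data.
sample = (
    "Name,Birthdate,Address,Zip,Gender,STAR_Reading\n"
    "Student One,2014-01-01,1 Example St,90301,F,512\n"
)
print(strip_identifiers(sample))
```

Open the output and skim it yourself before uploading anything.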
Step 2: Making sure ChatGPT knows what we're doing
I wanted to create a prompt, and I like the term poolbook, so I started the conversation there.

Okay - prompt given and poolbook defined!
Step 3: Clarifying
One of the most powerful prompting tools is to ask ChatGPT for clarifying questions. Simply append a request at the end of your initial prompt telling ChatGPT to ask questions prior to generating.
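If you reuse this trick often, or send prompts from a script, the append step is easy to automate. A tiny sketch follows. The wording of `CLARIFY_REQUEST` and of the sample prompt is illustrative only, not the exact text I used in this conversation.

```python
# Illustrative wording; not the exact request used in this post.
CLARIFY_REQUEST = (
    "Before you generate anything, ask me any clarifying questions "
    "you need answered first."
)

def with_clarifying_request(prompt):
    """Append the clarification request to the end of an initial prompt."""
    return prompt.rstrip() + "\n\n" + CLARIFY_REQUEST

# Hypothetical initial prompt.
initial_prompt = (
    "I am going to upload an anonymized poolbook. "
    "Look for trends in academic performance and enrollment."
)
print(with_clarifying_request(initial_prompt))
```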

And my answers:

Okay - looks like we are ready to go, but you could specify more if that helps!
And upload...

Step 4: The Report
So ChatGPT started by giving me an overview - and right away noted that the STAR data for Reading was a mixture of types. Something to fix for the next time!
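A mixed column like this can be caught before uploading. The sketch below counts how many values in a column look like numbers, text, or blanks. The `star_reading` values are invented for illustration, since I'm not reproducing the actual mixture from my file here.

```python
from collections import Counter

def classify(value):
    """Label one cell as 'blank', 'number', or 'text'."""
    text = value.strip()
    if not text:
        return "blank"
    try:
        float(text)
        return "number"
    except ValueError:
        return "text"

def column_kinds(values):
    """Count how many cells of each kind a column contains."""
    return Counter(classify(v) for v in values)

# Made-up column values, for illustration only.
star_reading = ["512", "487", "At/Above", "", "530"]
kinds = column_kinds(star_reading)
print(dict(kinds))  # {'number': 3, 'text': 1, 'blank': 1}
```

More than one kind in the counts (beyond blanks) is a hint that the column needs cleaning first.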


I asked for one whole report - which generated a snapshot and insights.


It was the "dig deeper" step that was even more interesting, breaking down scores by gender, free/reduced lunch status, ELL status, and ethnicity, all in table form. Then we got some conclusions:
📘 Scores by Gender
Girls slightly outperform boys in both Reading and Math on teacher scores, but STAR data shows similar averages.
May indicate differences in teacher perception or classroom engagement by gender.
🍎 Scores by Free/Reduced Lunch (FRL) Status
Students not eligible for FRL scored significantly higher on both STAR and teacher assessments in Reading and Math.
Indicates economic status is a key factor in academic outcomes—consider targeted support/intervention.
🌍 Scores by ELL Status
ELL students show notably lower performance across all metrics.
Suggests a strong case for enhanced language support, integrated literacy strategies, and ongoing teacher PD for ELL best practices.
🌐 Scores by Ethnicity
Hispanic students (coded as "H") are the largest group and perform below the overall average on both STAR and teacher assessments.
One group (“A”) shows unusually high scores—this may be a data input error or an outlier.
Consider reviewing the cultural responsiveness of instruction and the impact of language or access barriers on assessment.
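Subgroup averages like the ones above are worth spot-checking yourself, since a chatbot can misread a column. Here is a rough sketch of one way to do that. The field names (`FRL`, `STAR_Math`) and the three rows are hypothetical stand-ins, not values from my poolbook.

```python
from collections import defaultdict
from statistics import mean

def average_by_group(rows, group_key, score_key):
    """Return the mean score for each value of group_key, rounded to 0.1."""
    scores = defaultdict(list)
    for row in rows:
        scores[row[group_key]].append(float(row[score_key]))
    return {group: round(mean(vals), 1) for group, vals in scores.items()}

# Made-up rows, for illustration only.
rows = [
    {"FRL": "Y", "STAR_Math": "480"},
    {"FRL": "Y", "STAR_Math": "500"},
    {"FRL": "N", "STAR_Math": "540"},
]
print(average_by_group(rows, "FRL", "STAR_Math"))  # {'Y': 490.0, 'N': 540.0}
```

Swapping `"FRL"` for a gender, ELL, ethnicity, or zip code column gives the same kind of breakdown for that subgroup.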
A similar breakdown occurred by zip code, along with observations. Finally, a conclusion:
🧭 Next Recommendations
Targeted Recruitment:
Grow in 90304 and re-engage 90301 with community/parish outreach.
Consider family surveys in 90047 to assess potential enrollment barriers.
Support Strategy:
Focus literacy and math supports in 90250 and 90303.
Layer ELL and FRL supports where zip codes + subgroups intersect.
Communications:
Highlight STAR growth or success stories from high-performing zips in outreach materials.
Host feeder school/parish info nights in underrepresented areas.
ChatGPT also gave me the opportunity to generate a poorly formatted PDF report:

Reflection
Having an analysis with data points I'm not as familiar with (zip codes) was interesting—along with seeing the data sliced in different ways at my fingertips. With additional data and the opportunity to interrogate it with the chatbot, I think this could be very valuable for office and teaching staff.