28 Feb 2025
I recently went to a physics teachers’ conference and found out that Colin Hopkins, a retired but still prominent and very active member of the community, has been working on something that I very much care about and want to see succeed. It is not only a fabulous idea that should have been done a long time ago, but also something that I would like to see implemented in every other subject.
For some background, Colin has been collecting past exam papers for longer than the education department has access to, and he has been sorting all of the questions from each of those papers out based on concepts/skills. For example with Newton’s Laws, he has further divided the subtopics down specifically to F=ma with constant acceleration, and coding each question for the question type (whether it is multiple choice, a calculation or an explanation question).
He also used the examiner's reports to make a note of what types of questions students did poorly on and the common mistakes that were made for each question. The most impressive part in all this is that he went on to create exam style multiple choice questions out of the data, such that every single option to every question is exclusively comprised of with common misconceptions that students are likely to select.
Colin is now at the stage where all the questions and answers are ready, and the next step he is working on is trying to find an easy way to let teachers convert their class results into a Guttman chart, analyse it for areas of weakness, and then link students to similar questions in his database for additional practice materials. I want to help him find an AI-assisted solution that can automate part of this process, and potentially even expand its impact for students and teachers.
I started by simply following his original idea, which was to give all of the data and instructions to ChatGPT to see what happens. Despite how many ideas I thought I had about what could work better, of course it was not so easy. Sometimes questions were missing and some students were not marked, or questions were marked but done incorrectly. By the end it even marked and provided results for students that didn't exist at all. Rubbish in, rubbish out.
The next step was to explore a few more advanced options, such as creating my own 'GPT' or using the API 'Assistants' in the hope of being able to create consistently useful outputs. (Oct 2025 update: this feature has now been deprecated by the Agent Builder).
The idea was simple enough, and I had everything I needed to get started. I'll spare you the details, but in the end my long winding adventure into API keys, system instructions and rendering markdown formatting did not result in a useful solution. However, I did discover through the process that LLMs in general were still not reliable enough to handle large amounts of data accurately... and that it is very cheap to use if you know what to ask for. With the right setup, the current (Feb 2025) pricing of GPT-4o mini output is USD$0.60, measured by the millions of tokens, which will last for quite some time for most purposes. For comparison, the latest GPT-4.5 costs USD$150 per 1 million output tokens (250 times the price).
Eventually I decided to go back to basics and pursue the path of good old excel. With the help of AI again, I discovered that there was a way to excel formula everything I needed into the sheet, such that you only need to provide student responses and answers and it would take care of the rest. Neat!
Interestingly, the hurdle that took me the longest time to get over was the fact that Google Sheets does not allow for horizontal sorting with a formula. The Microsoft version of Excel in Office is slightly different and it has that function, so everything worked well after the migration. With the handy sort function on hand, the final step was to simply write out the formulas, format them nicely and they were good to go.
Or so I thought! Life would be boring if things were so easy.
Turns out that as soon as Microsoft (or Google) forms receives a new response, the excel includes the new data by inserting a new row which shifts all other cells (including references) down. As long as something is outside of the automatically created OfficeForms.Table , even absolute references will get shifted down! I have never see this happen anywhere else, so there was a lot of testing and experimenting involved to find out what was happening.
Keeping the references intact would normally be a good feature because everything stays the same as before and nothing gets messed up, except now I actually want the new responses to show up in my formulas. The only solution I could think of was to use a dynamic type of referencing, which is anything that does not involve naming the specific cells that you want. By using something like =OfficeForms.Table[Id] which will return the entire array from the Id column of the OfficeForms.Table table, any new form responses that get added to the table will show up as well. As a bonus side note: =OfficeForms.Table[#Data] is a special case that will just return only all rows and all columns in the table body, but it does not include the column headers.
Eventually after many long days of staring at sorting functions and screens covered in 1s and 0s, I was finally able to create a fully automatic Guttman chart sorting program. It includes integration with MS Forms so that students can click on multiple choice options for each question, submit the form digitally, and the teacher would automatically open up the Excel document to a fully sorted and colour coded Guttman chart.
Using the data that Colin meticulously collated and organised, I later also added additional statistics that can be viewed at a glance like a dashboard next to the Guttman data. Information such as how well the class completed each question in comparison to the national average for that question, what was the most common incorrect response to each question, the potential misconceptions that could have led to students choosing their incorrect response, and what questions they can work on to practice with based on their results.
The next stage will be to develop a way for this to expand beyond pre-built multiple choice questions, and hopefully become more flexible and useable for short answer and extended response questions too.