/ 2023 · Final Year Project · Research to production

Automated Summary Evaluator (FYP)

An LLM-powered system that scores student summaries on content and wording. It is trained on datasets sourced from multiple NGOs and deployed through CI/CD pipelines with Docker so that it runs consistently in real-world use.

/ The Problem

Educators hand-scoring summaries written by students with ADHD faced inconsistent grades and long turnaround times. The NGOs needed an automated way to grade content and wording at scale.

/ The Approach

Curated labelled summary datasets from multiple NGOs, then trained LLMs as regressors to score content and wording on a continuous scale. Wrapped the model in a CI/CD + Docker pipeline so grading runs identically in research and production.
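As a rough illustration of what "LLM as regressor" means here, the sketch below maps a pooled text embedding to two continuous scores through a linear head and measures error with mean squared error. It is a minimal stand-in, not the project's code: the `RegressionHead` class, the `mse_loss` helper, the 3-dimensional embedding, and all weights are hypothetical, and the LLM encoder that would produce the embedding is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

TARGETS = ("content", "wording")  # the two scores named in the write-up


@dataclass
class RegressionHead:
    """Linear layer mapping a pooled text embedding to one value per target."""

    weights: List[List[float]]  # shape: len(TARGETS) x embedding_dim
    bias: List[float]           # shape: len(TARGETS)

    def predict(self, embedding: Sequence[float]) -> Dict[str, float]:
        scores = {}
        for name, row, b in zip(TARGETS, self.weights, self.bias):
            # Dot product plus bias: no softmax, so the output is unbounded.
            scores[name] = sum(w * x for w, x in zip(row, embedding)) + b
        return scores


def mse_loss(predicted: Dict[str, float], labelled: Dict[str, float]) -> float:
    """Mean squared error averaged over both targets."""
    return sum((predicted[t] - labelled[t]) ** 2 for t in TARGETS) / len(TARGETS)


# Hypothetical embedding standing in for the LLM's pooled output.
embedding = [0.5, -1.0, 2.0]
head = RegressionHead(
    weights=[[1.0, 0.0, 0.5], [0.0, 1.0, 0.25]],  # illustrative values only
    bias=[0.1, -0.2],
)
scores = head.predict(embedding)
loss = mse_loss(scores, {"content": 1.5, "wording": -0.5})
print(scores, loss)
```

Because the head outputs raw real numbers rather than class probabilities, each summary receives a score on a continuous scale, and training would minimise a loss such as `mse_loss` against the labelled scores.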

/ The Impact

Graded summaries in seconds instead of minutes. The regressors were trained on curated ADHD datasets from multiple NGOs and shipped via GitHub Actions CI/CD and Docker for reproducible deployment.

/ Highlights

LLMs trained as regressors to score content and wording continuously
Datasets curated and merged from multiple partner NGOs
GitHub Actions CI/CD + Docker for reproducible grading runs
Cut turnaround from minutes per summary to seconds
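One way the dataset merging mentioned above could look is sketched below. It assumes each partner supplies records with `summary`, `content`, and `wording` fields; those field names, the partner names, and the sample rows are all hypothetical, and the real curation steps may differ.

```python
from typing import Dict, Iterable, List

REQUIRED = ("summary", "content", "wording")  # hypothetical column names


def merge_sources(sources: Dict[str, Iterable[dict]]) -> List[dict]:
    """Combine per-partner records, tagging provenance and dropping bad rows."""
    merged, seen = [], set()
    for source_name, rows in sources.items():
        for row in rows:
            if any(row.get(field) in (None, "") for field in REQUIRED):
                continue  # skip rows missing the text or a label
            # Normalise case and whitespace so near-identical texts collide.
            dedup_key = " ".join(str(row["summary"]).lower().split())
            if dedup_key in seen:
                continue  # keep only the first copy of a duplicated summary
            seen.add(dedup_key)
            merged.append({
                "summary": row["summary"],
                "content": float(row["content"]),
                "wording": float(row["wording"]),
                "source": source_name,
            })
    return merged


# Illustrative records only; not real project data.
sources = {
    "partner_a": [
        {"summary": "The story is about a fox.", "content": 0.4, "wording": 0.1},
        {"summary": "", "content": 0.0, "wording": 0.0},
    ],
    "partner_b": [
        {"summary": "the story  is about a fox.", "content": 0.5, "wording": 0.2},
        {"summary": "A fox tricks a crow.", "content": 1.2, "wording": 0.9},
    ],
}
dataset = merge_sources(sources)
print(len(dataset), [row["source"] for row in dataset])
```

Keeping a `source` field on every record makes it possible to check later whether scores differ systematically between partners.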

/ Stack

Python
LLMs
CI/CD
Docker
GitHub Actions

View source on GitHub
