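Our College Acceptance Predictor achieves an accuracy of 90.99% on our testing dataset. This figure is derived from a Mean Absolute Error (MAE) of 0.3604 across four possible prediction classes. The two numbers are consistent with accuracy = 1 - MAE / 4 (1 - 0.3604 / 4 ≈ 0.9099), so it is an error-based score rather than the share of predictions that match the true class exactly.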
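As a rough illustration of how an MAE-based accuracy like the one above could be computed, here is a minimal sketch. It assumes the four classes are encoded as the ordinal integers 0 to 3 and that the score is 1 - MAE / number of classes. The page does not state the formula, so this is an inference from the reported numbers; the function name and the sample labels are hypothetical.

```python
def mae_based_accuracy(y_true, y_pred, n_classes=4):
    """Return (mae, score), where score = 1 - mae / n_classes.

    Assumption: classes are ordinal integers (0..n_classes-1), so the
    absolute difference between two labels is a meaningful distance.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return mae, 1.0 - mae / n_classes


# Hypothetical labels: one prediction out of four is off by one class.
mae, score = mae_based_accuracy([0, 1, 2, 3], [0, 1, 3, 3])

# Plugging in the reported MAE reproduces the reported accuracy.
reported_score = round(1 - 0.3604 / 4, 4)
```

Under this reading, a prediction that is off by one class still earns partial credit, so the score will generally be higher than exact-match accuracy on the same predictions.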
We collected a large dataset of posts from r/collegeresults, then preprocessed and categorized it using GPT-4o. In this step we extracted relevant features from each post, such as GPA, test scores, and extracurricular activities.
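To make the extraction step concrete, here is a small sketch of turning one labeled post into a typed record. It assumes the labeling model returns a JSON object per post. The field names (`gpa`, `sat`, `extracurriculars`) and the `ApplicantFeatures` class are hypothetical and are not taken from our actual pipeline.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ApplicantFeatures:
    # Hypothetical schema: the real feature set is not described here.
    gpa: float
    sat: Optional[int] = None
    extracurriculars: List[str] = field(default_factory=list)


def parse_labeled_post(raw_json: str) -> ApplicantFeatures:
    """Convert one JSON object (as a string) into an ApplicantFeatures record."""
    data = json.loads(raw_json)
    sat = data.get("sat")
    return ApplicantFeatures(
        gpa=float(data["gpa"]),  # required field; raises KeyError if missing
        sat=int(sat) if sat is not None else None,
        extracurriculars=list(data.get("extracurriculars", [])),
    )


record = parse_labeled_post(
    '{"gpa": 3.9, "sat": 1520, "extracurriculars": ["debate"]}'
)
```

Keeping optional fields explicit (here, `sat` may be missing) makes it easier to decide later how missing values should be handled during feature engineering.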
We applied various feature engineering techniques, including:
Our prediction system uses an ensemble of two models:
We used k-fold cross-validation and hyperparameter tuning via Optuna to optimize our models. The training process involved handling class imbalance through techniques like SMOTE (Synthetic Minority Over-sampling Technique).
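For readers unfamiliar with SMOTE, the core idea can be sketched in a few lines: each synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its nearest minority neighbours. The code below is a simplified standard-library illustration of that idea, not our training code and not the imbalanced-learn implementation. The function name, the parameter defaults, and the sample feature values are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class rows: [GPA, SAT score]
minority_rows = [[3.9, 1500.0], [3.8, 1450.0], [4.0, 1550.0], [3.7, 1400.0]]
new_rows = smote_sketch(minority_rows, n_new=5)
```

When oversampling is combined with k-fold cross-validation, it is generally applied to the training portion of each fold only, so that synthetic points derived from validation rows do not leak into training. In practice, features on very different scales (such as GPA and SAT here) would usually be scaled before computing distances.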
When a user submits an application, the following steps occur:
Note: While our model shows high accuracy on our testing dataset, individual predictions may vary. Many factors in college admissions are subjective and can't be perfectly predicted by any model. Use this tool as a guide, not as a definitive answer.
We are two high school students passionate about making college admissions more transparent. Running this service is expensive because it relies on GPT-4o and other advanced AI models.
Your donations help us keep this service running and improve it further. Any amount is greatly appreciated!
While we use GPT-4o for labeling our training and testing data, the actual prediction process is fundamentally different and more sophisticated:
This approach allows us to leverage the strength of GPT-4o in natural language understanding while using specialized machine learning models for the actual prediction task.
We've made the decision not to release our source code for several reasons:
We are currently in the process of writing a comprehensive paper detailing our methodology, results, and findings. We anticipate releasing this paper in the coming months. The paper will provide in-depth insights into our approach, including:
We're excited to share our findings with the academic community and contribute to the ongoing discussion about predictive models in education.
If you're interested in sponsoring our project or collaborating with us, please reach out to John Tian at [email protected]. We're always open to partnerships that can help us improve our tool and make it more accessible to students worldwide.