You can find the full list of my projects on my GitHub account.

research

Mixed-modal Reasoning
Trained 3 paradigms of visual reasoning using GRPO

Multi-turn RL
Extended the VeRL framework to support training multimodal models with multi-turn reinforcement learning and external tools.

university

Visual Reasoning
Explored GRPO to enhance visual question answering in vision-language models

GalactiTA
1.3B LLM trained through a 3-stage pipeline of SFT, DPO, and RAG-tuning on scientific datasets.

YouTube Analysis
Analysis of Tech channels on YouTube using videos published between May 2005 and October 2019

Stance Detection
Fine-tuning Large Language Models for argument stance detection in unseen domains

Mountain Car
Handling sparse reward challenges in reinforcement learning using DQN and Dyna-Q algorithms

Segmentation and Classification
Using classic computer vision techniques for segmentation and extraction, and deep learning for classification

Predicting Cardiovascular Diseases
Using machine learning on behavioral risk factor data to predict heart disease

Document Retrieval
Built an efficient IR system across 7 languages under computational limits

Recommender Systems
Comparing collaborative filtering, matrix factorization, and neural networks

miscellaneous

Satellite Imagery
🥇1st Place🥇 - Hackathon on analyzing satellite imagery based on LLMs and CV (Lauzhack 2024, AXA challenge)

LLM training
🥈2nd Place🥈 - Hackathon on LLM training & architecture