Research Experience
My research focuses on Natural Language Processing, with emphasis on low-resource languages, multimodal AI, and reasoning systems.
Code Generation in Bangla: Low-Resource Language Adaptation
2025 | Shared Task at the BLP Workshop, co-located with IJCNLP-AACL
- Low-Resource Adaptation: Investigated code generation in Bangla by fine-tuning open-source models (Llama 3, TigerLLM, Qwen) using parameter-efficient LoRA adapters to overcome data scarcity challenges.
- Reasoning Enhancement: Enhanced model reasoning via Chain-of-Thought (CoT) prompting and a self-refinement loop that iteratively critiques and corrects generated code based on execution feedback.
- Error Analysis: Conducted rigorous error analysis to identify failure modes in code generation tasks, achieving 85% accuracy and securing 4th place out of 32 teams.
- Publication: AdversaryAI at BLP-2025 Task 2: A Think, Refine, and Generate (TriGen) System with LoRA and Self-Refinement for Code Generation
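The execution-feedback self-refinement loop described above can be sketched as follows. This is an illustrative outline, not the published TriGen system: `generate` and `run_tests` are hypothetical stand-ins for the LLM call and the code-execution harness.

```python
# Minimal sketch of a self-refinement loop driven by execution feedback.
# `generate` and `run_tests` are placeholders, not the published system's API.
from typing import Callable, Optional


def self_refine(generate: Callable[[str], str],
                run_tests: Callable[[str], Optional[str]],
                prompt: str,
                max_rounds: int = 3) -> str:
    """Regenerate code until the execution harness reports no error.

    `run_tests` returns None on success, or an error message that is fed
    back into the next generation prompt as critique.
    """
    code = generate(prompt)
    for _ in range(max_rounds):
        error = run_tests(code)
        if error is None:
            return code
        # Append the execution error to the prompt and ask for a fix.
        code = generate(f"{prompt}\n# Previous attempt failed with: {error}")
    return code
```

With a real model behind `generate`, the loop terminates either on passing tests or after `max_rounds` critique cycles.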
BRAINTEASER: Advanced Commonsense Reasoning in Language Models
2024 | Shared Task at SemEval 2024, co-located with NAACL
- Designed data augmentation pipelines to improve model robustness for complex commonsense reasoning tasks involving lateral thinking puzzles.
- Conducted a comparative analysis of advanced language models, examining performance gaps and reasoning patterns in non-standard logical scenarios.
- Publication: Deja Vu at SemEval 2024 Task 9: A Comparative Study of Advanced Language Models for Commonsense Reasoning
Violence Inciting Text Detection (VITD) in Bangla
2023 | Shared Task at the BLP Workshop, co-located with EMNLP
- Applied semi-supervised self-training techniques to address class imbalance and small dataset limitations, significantly improving model performance on minority classes.
- Enhanced dataset diversity and quality through back-translation with the Googletrans API, improving semantic variety while addressing linguistic inconsistencies in the Bangla text.
- Implemented an ensemble approach combining multiple transformer models with bagging and majority voting to reduce prediction variance and increase overall robustness, achieving a Top-20 rank.
- Publication: Team_Syrax at BLP-2023 Task 1: Data Augmentation and Ensemble Based Approach for Violence Inciting Text Detection in Bangla
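The majority-voting step of the ensemble above can be sketched as a small pure-Python function. This is a generic illustration of the technique, not the project's actual code; the model outputs are assumed to be per-sample class labels.

```python
# Illustrative majority voting over label predictions from several models.
from collections import Counter


def majority_vote(predictions: list[list[str]]) -> list[str]:
    """Combine per-model label lists into one prediction per sample.

    predictions[m][i] is model m's label for sample i; the result is the
    most common label across models for each sample (ties broken by
    first-seen order, as Counter.most_common does).
    """
    return [Counter(sample).most_common(1)[0][0]
            for sample in zip(*predictions)]
```

In a bagging setup, each inner list would come from a transformer fine-tuned on a different bootstrap sample of the training data.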
Improving Answer Space Diversity in Visual Question Answering (VQA)
2022 | Undergraduate Thesis Project
- Conducted a comparative study of various state-of-the-art VQA methods, identifying and analyzing their core limitations, particularly in answer distribution.
- Addressed the “Answer Space Diversity” limitation in the VQA v2 dataset by augmenting training data with automatically generated, contextually relevant question-answer pairs using template-based synthesis.
- Demonstrated improved model performance on long-tail and less frequent answer categories, reducing the model’s bias toward frequent answers.
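Template-based QA synthesis of the kind described above can be illustrated with a minimal sketch. The annotation fields (`name`, `color`, `count`) and templates here are hypothetical, chosen only to show the pattern of filling question templates from structured image annotations.

```python
# Sketch of template-based question-answer synthesis from (hypothetical)
# per-image object annotations; not the thesis implementation.
def synthesize_qa(objects: list[dict]) -> list[tuple[str, str]]:
    """Generate (question, answer) pairs from simple attribute templates."""
    pairs = []
    for obj in objects:
        # One template per attribute; answers for rare attribute values
        # help rebalance the long tail of the answer distribution.
        pairs.append((f"What color is the {obj['name']}?", obj["color"]))
        pairs.append((f"How many {obj['name']}s are there?", str(obj["count"])))
    return pairs
```

Pairs generated this way would be mixed into the VQA v2 training set to boost coverage of infrequent answers.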
