Research Experience
Violence Inciting Text Detection (VITD) in Bangla
2023 | Shared Task at the BLP Workshop, co-located with EMNLP
- Applied semi-supervised self-training to mitigate class imbalance and the small size of the labeled dataset, improving model performance.
- Augmented the dataset through back-translation with the Googletrans library, increasing semantic variety and reducing linguistic inconsistencies.
- Implemented an ensemble approach combining multiple models with bagging and majority voting to reduce prediction variance and increase overall robustness.
- Publication: Team_Syrax at BLP-2023 Task 1: Data Augmentation and Ensemble Based Approach for Violence Inciting Text Detection in Bangla
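The bagging and majority-voting step described in the VITD project can be illustrated with a minimal sketch. This is not the code from the publication: the function names `bootstrap_sample` and `majority_vote`, the integer labels, and the tie-breaking rule are all assumptions made for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(examples, rng):
    """Draw len(examples) items with replacement (one bagging resample)."""
    return [rng.choice(examples) for _ in examples]


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per example.

    predictions_per_model[m][i] is model m's label for example i.
    Ties go to the tied label that appears first in model order
    (an assumed rule, chosen here for simplicity).
    """
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three models on three examples.
model_predictions = [
    [0, 1, 2],
    [0, 2, 2],
    [1, 1, 2],
]
final_labels = majority_vote(model_predictions)
```

In a full pipeline, each model would be trained on its own `bootstrap_sample` of the training set before its predictions are passed to `majority_vote`.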
BRAINTEASER: A Novel Task Defying Common Sense
2024 | Shared Task at SemEval 2024, co-located with NAACL
- Designed data augmentation pipelines to improve model robustness for complex commonsense reasoning tasks.
- Conducted a comparative analysis of advanced language models on the task's commonsense reasoning problems.
- Publication: Deja Vu at SemEval 2024 Task 9: A Comparative Study of Advanced Language Models for Commonsense Reasoning
Improving Answer Space Diversity in Visual Question Answering (VQA)
2022 | Undergraduate Thesis Project
- Conducted a comparative study of various state-of-the-art VQA methods, identifying and analyzing their core limitations, particularly in answer distribution.
- Addressed the “Answer Space Diversity” limitation in the VQA v2 dataset by augmenting training data with automatically generated, contextually relevant question-answer pairs.
- Demonstrated improved model performance on long-tail (less frequent) answer categories.