Research

My research explores various areas, including Medical Image Analysis, Explainable Artificial Intelligence, Computer Vision, Generative Adversarial Networks, Multimodal Deep Learning, Large Language Models, Natural Language Processing, Machine Learning, and the applications of Deep Learning.

1. Explainable AI in Medical Image Analysis

Medical image analysis plays an important role in diagnosing and treating diseases, particularly in eye care and cancer treatment, using advanced machine learning models such as convolutional neural networks (CNNs). However, the complexity of these models can make their decisions difficult to understand, which is problematic in clinical settings. To address this, explainable AI techniques clarify how CNNs classify eye conditions from retinal images, while new segmentation models accurately identify blood vessels, aiding in vascular health assessment. By combining classification and segmentation, doctors can make better decisions. For lung and colon cancer detection, explainable AI also helps healthcare professionals understand and trust the predictions, improving patient communication and outcomes.

Exploring Explainable AI Techniques for Improved Interpretability in Lung and Colon Cancer Classification. Mukaffi Bin Moin, Fatema Tuj Johora Faria, Swarnajit Saha, Busra Kamal Rafa, Mohammad Shafiul Alam. Presented at the 4th International Conference on Computing and Communication Networks (ICCCNet-2024). [PDF]
Explainable Convolutional Neural Networks for Retinal Fundus Classification and Cutting-Edge Segmentation Models for Retinal Blood Vessels from Fundus Images. Fatema Tuj Johora Faria, Mukaffi Bin Moin, Pronay Debnath, Asif Iftekher Fahim, Faisal Muhammad Shah. Under Review in Journal of Visual Communication and Image Representation. [PDF]

2. Multimodal Deep Learning

Multimodal deep learning improves understanding by combining images and text through one of three fusion strategies: early, late, or intermediate fusion. Early fusion merges raw images and text before processing, allowing a shared representation but risking sensitivity to noise. Late fusion processes each modality separately and combines the results afterwards, offering flexibility but possibly missing important cross-modal connections. Intermediate fusion combines features at different stages of processing, aiming to balance these trade-offs. A challenge for Bangla language applications is the lack of diverse annotated image-text datasets. Bangla's unique linguistic features complicate text processing, and without specialized models, areas like fake news detection and disaster identification are difficult to improve. Custom models are needed to address these challenges in Bangla.
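The three fusion strategies can be contrasted with a minimal sketch. It is an illustration only: the feature vectors, the `linear_score` stand-in classifier, the weights, and the toy per-modality layer are all hypothetical, and it fuses precomputed feature vectors rather than raw images and text.

```python
from typing import List


def linear_score(feats: List[float], weights: List[float], bias: float = 0.0) -> float:
    """Toy stand-in for a trained classifier: a weighted sum of the features."""
    return sum(f * w for f, w in zip(feats, weights)) + bias


def early_fusion(image_feats: List[float], text_feats: List[float]) -> List[float]:
    """Join both modalities into one vector before any joint model sees them."""
    return image_feats + text_feats


def intermediate_fusion(image_feats: List[float], text_feats: List[float]) -> List[float]:
    """Pass each modality through its own toy layer (scale, then ReLU), then join the hidden features."""
    image_hidden = [max(0.0, 2.0 * f) for f in image_feats]
    text_hidden = [max(0.0, 2.0 * f) for f in text_feats]
    return image_hidden + text_hidden


def late_fusion(image_score: float, text_score: float, image_weight: float = 0.5) -> float:
    """Combine per-modality predictions after each model has run on its own."""
    return image_weight * image_score + (1.0 - image_weight) * text_score


# Hypothetical outputs of an image encoder and a text encoder.
image_feats = [0.2, 0.8]
text_feats = [0.5, 0.1, 0.4]

# Early fusion: a single classifier over the concatenated vector.
fused = early_fusion(image_feats, text_feats)
early_score = linear_score(fused, [1.0] * len(fused))

# Late fusion: one classifier per modality, then a weighted average of the scores.
image_score = linear_score(image_feats, [1.0, 1.0])
text_score = linear_score(text_feats, [1.0, 1.0, 1.0])
late_score = late_fusion(image_score, text_score)
```

In this sketch the only difference between the strategies is where the join happens: on the inputs, on hidden features, or on the final scores.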

Uddessho: An Extensive Benchmark Dataset for Multimodal Author Intent Classification in Low-Resource Bangla Language. Fatema Tuj Johora Faria, Mukaffi Bin Moin, Md. Mahfuzur Rahman, Md Morshed Alam Shanto, Asif Iftekher Fahim and Md. Moinul Hoque. Presented at the 18th International Conference on Information Technology and Applications (ICITA 2024). [PDF]
BanglaCalamityMMD: A Comprehensive Benchmark Dataset for Multimodal Disaster Identification in the Low-Resource Bangla Language. Fatema Tuj Johora Faria, Mukaffi Bin Moin, Busra Kamal Rafa, Swarnajit Saha, Md. Mahfuzur Rahman, Khan Md Hasib, and M. F. Mridha. Under Review in International Journal of Disaster Risk Reduction.
MultiBanFakeDetect: Integrating Advanced Fusion Techniques for Multimodal Detection of Bangla Fake News in Under-Resourced Contexts. Fatema Tuj Johora Faria, Mukaffi Bin Moin, Md Arafat Alam Khandaker, Niful Islam, Khan Md Hasib, Md Saddam Hossain Mukta, and M. F. Mridha. Under Review in International Journal of Information Management Data Insights.
BanglaMemeEvidence: A Multimodal Benchmark Dataset for Explanatory Evidence Detection in Bengali Memes. Fatema Tuj Johora Faria, Mukaffi Bin Moin, Asif Iftekher Fahim, Pronay Debnath, and Faisal Muhammad Shah. Submitted to an A* Rank Conference.

3. Sentiment Analysis and Assessing the Level of Toxicity in Social Media

Sentiment analysis categorizes the emotion expressed in text as positive, negative, or neutral, while toxic comment detection identifies harmful or abusive language, such as personal attacks and discriminatory remarks. Both are crucial in today’s online environment. Large Language Models (LLMs) like Gemini 1.5 Pro and GPT-3.5 Turbo have transformed NLP by learning from raw text, improving performance on both tasks, especially for low-resource languages like Bangla. Sensitive topics such as transgender rights, indigenous issues, and migration are frequent targets of toxic language, but traditional models struggle because curated Bangla datasets are lacking. LLMs help mitigate this by performing well even with limited data, but high-quality, issue-specific datasets are still necessary to improve accuracy, foster inclusivity, and protect marginalized communities from harmful content.

Motamot: A Dataset for Revealing the Supremacy of Large Language Models over Transformer Models in Bengali Political Sentiment Analysis. Fatema Tuj Johora Faria*, Mukaffi Bin Moin*, Rabeya Islam Mumu, Md Mahabubul Alam Abir, Abrar Nawar Alfy and Mohammad Shafiul Alam. Presented at the IEEE Region 10 Symposium (TENSYMP 2024). [PDF]
Assessing the Level of Toxicity Against Distinct Groups in Bangla Social Media Comments: A Comprehensive Investigation. Mukaffi Bin Moin, Pronay Debnath, Usafa Akther Rifa, Rijeet Bin Anis. Presented at the 18th International Conference on Information Technology and Applications (ICITA 2024). [PDF]

4. Natural Language Inference

Natural Language Inference (NLI) is a crucial NLP task that determines whether a premise supports, contradicts, or is unrelated to a hypothesis, aiding applications like question answering, information retrieval, and chatbots. NLI has three categories: entailment (the hypothesis follows from the premise), contradiction (both cannot be true), and neutral (the hypothesis is independent of the premise). It is especially important for languages like Bangla, improving NLP models' ability to interpret Bangla text and meet the growing demand for tools like chatbots and virtual assistants. Large Language Models (LLMs) enhance NLI by learning subtle sentence relationships and can be fine-tuned for specific tasks, making them effective even with limited labeled data.
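As a rough illustration of the three-way label scheme, the sketch below represents the labels as an enum, builds a few-shot prompt that could be sent to an LLM, and parses a reply back into a label. The prompt wording, the English example pairs, and the helper names `build_prompt` and `parse_label` are assumptions made for illustration; no model is called, and this is not the prompt format used in the paper below.

```python
from enum import Enum
from typing import List, Tuple


class NLILabel(Enum):
    ENTAILMENT = "entailment"        # the hypothesis follows from the premise
    CONTRADICTION = "contradiction"  # both cannot be true
    NEUTRAL = "neutral"              # the hypothesis is independent of the premise


def build_prompt(premise: str, hypothesis: str,
                 examples: List[Tuple[str, str, NLILabel]]) -> str:
    """Format labeled example pairs followed by the unlabeled query pair."""
    lines = ["Label the hypothesis as entailment, contradiction, or neutral "
             "with respect to the premise.", ""]
    for ex_premise, ex_hypothesis, ex_label in examples:
        lines += [f"Premise: {ex_premise}",
                  f"Hypothesis: {ex_hypothesis}",
                  f"Label: {ex_label.value}", ""]
    lines += [f"Premise: {premise}", f"Hypothesis: {hypothesis}", "Label:"]
    return "\n".join(lines)


def parse_label(reply: str) -> NLILabel:
    """Map a free-text model reply onto one of the three labels."""
    text = reply.strip().lower()
    for label in NLILabel:
        if text.startswith(label.value):
            return label
    raise ValueError(f"Reply does not start with a known label: {reply!r}")


# Illustrative few-shot examples (English here for readability).
examples = [
    ("A man is playing a guitar.", "A person is making music.", NLILabel.ENTAILMENT),
    ("A man is playing a guitar.", "The man is asleep.", NLILabel.CONTRADICTION),
]
prompt = build_prompt("The shop is closed today.", "It is raining.", examples)
```

Keeping the parser strict (it raises on anything outside the three labels) makes it easier to notice when a model answers in an unexpected format.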

Unraveling the Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study. Fatema Tuj Johora Faria, Mukaffi Bin Moin, Asif Iftekher Fahim, Pronay Debnath, Faisal Muhammad Shah. Presented at the 4th International Conference on Computing and Communication Networks (ICCCNet-2024). [PDF]

5. Text Generation in Bengali

Text generation in NLP involves creating human-like text using models, with tasks like paraphrase generation, reading comprehension, and formal document creation. Large Language Models (LLMs) excel at understanding context and generating varied expressions of the same idea, which is useful for Bengali educational content and creative writing. Fine-tuning LLMs with Bengali datasets enhances their ability to answer questions, summarize information, and generate formal documents, aided by techniques like Retrieval-Augmented Generation (RAG). In the mental health domain, LLMs can offer empathetic, culturally relevant advice by training on specialized datasets. While challenges like limited data hinder development for low-resource languages like Bangla, methods like fine-tuning and few-shot learning help LLMs perform well. Ethical considerations are vital to ensure generated content, especially in mental health, is safe, reliable, and culturally sensitive.
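The retrieval step of Retrieval-Augmented Generation can be sketched in a few lines. This is a simplified illustration, not the pipeline from the work listed below: it ranks passages by word overlap instead of using learned embeddings, the passages and the helper names `retrieve` and `build_rag_prompt` are hypothetical, and the regex tokenizer is a stand-in that suits the English toy text here (Bangla text would likely need a proper tokenizer).

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase and split into word tokens (simplified; see the note above)."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, passages: List[str], k: int = 1) -> List[str]:
    """Return the k passages sharing the most word tokens with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(passages,
                    key=lambda p: len(query_tokens & tokenize(p)),
                    reverse=True)
    return ranked[:k]


def build_rag_prompt(question: str, passages: List[str], k: int = 1) -> str:
    """Place the retrieved passages in front of the question for the generator."""
    context = "\n".join(retrieve(question, passages, k))
    return (f"Context:\n{context}\n\n"
            f"Question: {question}\n"
            "Answer using only the context above.")


# Hypothetical knowledge base.
passages = [
    "Dhaka is the capital of Bangladesh.",
    "The Padma is a major river in Bangladesh.",
    "Rice is a staple food in Bangladesh.",
]
question = "What is the capital of Bangladesh?"
rag_prompt = build_rag_prompt(question, passages)
```

Grounding the prompt in retrieved text is what lets the generator answer from supplied evidence rather than from memory alone.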

Tackling Hallucination in Bengali NLP: Enhancing Paraphrase Generation, Reading Comprehension, and Formal Application Writing Using LLMs with Few-Shot Learning, Fine-Tuning, and RAG. Saidur Rahman Sujon, Ahmadul Karim Chowdhury, Fatema Tuj Johora Faria, Mukaffi Bin Moin and Faisal Muhammad Shah. Submitted to an A* Rank Conference.

Ongoing Work:

Leveraging LLMs for Mental Health Advice Generation in Low-Resource Bangla Language.

6. Image-to-Text Generation

Image-to-text generation in agriculture, particularly for disease diagnosis and recommendations, is a growing field with great potential. Large vision-language models like LLaVA 1.5, InstructBLIP, GPT-4, and Fuyu can analyze and interpret visual features in plant images, such as color changes, texture, and shapes, to diagnose plant diseases. These models generate detailed textual descriptions that explain the disease and its symptoms and suggest treatments. Key applications include identifying plant diseases through visual symptoms like discoloration or wilting, helping farmers and agronomists make informed decisions for effective disease management.

Ongoing Work:

Image-to-Text Generation for Agricultural Disease Diagnosis and Recommendations.

7. Natural Language Processing for Medical Question Answering

Developing a question-answering (QA) system in low-resource languages like Bangla, particularly in the medical field, presents challenges due to limited datasets and pre-trained models. The system must process medical literature, clinical notes, and patient records to respond accurately to queries, despite the scarcity of Bangla medical resources. It also needs to handle complex medical terminology and English code-switching. Such a system could support healthcare professionals in making informed decisions, especially in rural areas, and improve patient education by answering medical questions in Bangla. The system should address various question types (factoid, list, confirmation, etc.), and building it requires overcoming data scarcity through methods like crowdsourcing and domain expert involvement to create annotated medical datasets in Bangla.

Ongoing Work:

BanglaMedQA: A Comprehensive Benchmark Dataset for Medical Question Answering.

8. Machine Translation and Regional Dialect Detection

Machine Translation (MT) in natural language processing (NLP) helps automatically translate text between languages, with Transformer models improving translation speed and accuracy. However, for low-resource languages, including dialects spoken by marginalized communities, there are challenges due to limited linguistic resources like annotated datasets. In Bangladesh, regional Bangla dialects such as those from Sylhet, Noakhali, and Mymensingh differ significantly from Standard Bangla, creating challenges for Dialect Machine Translation (DMT). These dialects have unique expressions that may not easily translate, and the lack of dialect-specific datasets further complicates model development. Similarly, Dialect Text Classification organizes text by dialect, enabling applications like regional content targeting, public sentiment analysis, and social media insights. Both tasks require careful handling of dialect variability and cultural nuances for effective translation and classification.

Vashantor: A Large-scale Multilingual Benchmark Dataset for Automated Translation of Bangla Regional Dialects to Bangla Language. Fatema Tuj Johora Faria, Mukaffi Bin Moin, Ahmed Al Wase, Mehidi Ahmmed, Md Rabius Sani, and Tashreef Muhammad. Under Review in Neural Computing and Applications. [PDF]

9. Generative Adversarial Networks in Agriculture

Generative Adversarial Networks (GANs) have transformed machine learning, particularly in agriculture, by enabling synthetic data generation to improve disease detection, such as for potato crops. Gathering images of infected potatoes at various stages is often difficult, but GANs create realistic, diverse images that enhance training datasets, improving models' ability to generalize and diagnose diseases accurately. This innovation helps researchers and farmers develop more effective diagnostic tools. Additionally, explainable AI builds trust by offering transparency in disease classification, fostering confidence among agricultural professionals. Instance segmentation further aids potato disease detection by identifying infected areas at the pixel level, enabling precise analysis of diseases like Black Scurf, Common Scab, Dry Rot, and Pink Rot. This technique helps differentiate between healthy and diseased tissues, assess disease severity, and track its progression, allowing for timely interventions and better crop management.
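One simple way a pixel-level mask could feed into a severity estimate is to measure the share of tuber tissue labeled as diseased. The sketch below assumes a hypothetical label encoding (0 for background, 1 for healthy tissue, 2 for diseased tissue) and a hand-written toy mask; it is not the severity measure used in the paper below.

```python
from typing import List

# Hypothetical label encoding for a per-pixel mask.
BACKGROUND, HEALTHY, DISEASED = 0, 1, 2


def disease_severity(mask: List[List[int]]) -> float:
    """Fraction of tissue pixels (healthy + diseased) that are labeled diseased."""
    healthy = sum(row.count(HEALTHY) for row in mask)
    diseased = sum(row.count(DISEASED) for row in mask)
    tissue = healthy + diseased
    if tissue == 0:
        return 0.0  # no tissue in view, so nothing to grade
    return diseased / tissue


# Toy 4x4 mask: a tuber with a small diseased patch near its center.
mask = [
    [0, 1, 1, 0],
    [1, 2, 2, 1],
    [1, 1, 2, 1],
    [0, 1, 1, 0],
]
severity = disease_severity(mask)  # 3 diseased pixels out of 12 tissue pixels
```

Comparing this ratio across images of the same tuber taken at different times would give a crude view of how the infection progresses.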

PotatoGANs: Utilizing Generative Adversarial Networks, Instance Segmentation, and Explainable AI for Enhanced Potato Disease Identification and Classification. Mohammad Shafiul Alam*, Fatema Tuj Johora Faria*, Mukaffi Bin Moin*, Ahmed Al Wase, Md. Rabius Sani and Khan Md Hasib. Under Review in Journal of Intelligent Information Systems. [PDF]

10. Computer Vision Applications in Agriculture

Disease classification is essential for sustainable farming and food security, as crop diseases can cause financial losses and disrupt food supply chains. Timely detection helps manage outbreaks and ensure healthy yields. Potatoes, a key staple, are prone to diseases like Black Scurf and Common Scab. Machine learning models, particularly convolutional neural networks (CNNs), effectively detect such diseases by analyzing spatial patterns in images. Hybrid models that combine CNNs with LSTM, GRU, and Bi-LSTM architectures enhance predictions by capturing both spatial features and the progression of symptoms over time, enabling more robust and comprehensive disease detection.

Classification of Potato Disease with Digital Image Processing Technique: A Hybrid Deep Learning Framework. Fatema Tuj Johora Faria, Mukaffi Bin Moin, Ahmed Al Wase, Md Rabius Sani, Khan Md Hasib, and Mohammad Shafiul Alam. 2023 IEEE 13th Annual Computing and Communication Workshop and Conference (CCWC). [Paper]

Mukaffi Bin Moin
