About me

I currently serve as a Postdoctoral Researcher at the Institute for the Future of Education (IFE), Tecnológico de Monterrey, Mexico. My role at IFE centers on leading interdisciplinary projects at the intersection of Natural Language Processing and Education.

I earned my Doctor of Philosophy (PhD) in Computer Science from the Centro de Investigación en Computación, Instituto Politécnico Nacional (IPN). My doctoral research focused on "Automatic Personality and Behavior Detection in Texts Using Deep Learning" under the guidance of Dr. Alexander Gelbukh and Dr. Grigori Sidorov. I hold a master's degree with honors in Computer Science from the Centro de Investigación en Computación, Instituto Politécnico Nacional (IPN). Additionally, I earned a bachelor's degree from Forman Christian University in Lahore, with a double specialization in Software Engineering and Computer Sciences.

My expertise lies in the field of Natural Language Processing (NLP), and my broad research interests encompass areas such as Personality and Emotion Detection, Low Resource Languages, and Question Answering. I am dedicated to leveraging machine and deep learning techniques to develop computing solutions that provide deeper insights into human behavior.

In my free time, I love to work out, cook, travel, and read. I'm particularly passionate about tea, savoring its diverse flavors and the art of brewing the perfect cup. If you ever find yourself in my company, I'd be delighted to invite you for a cup of tea and some great conversation.

Publications

 

  • 2023
  • Butt, S., Mejía-Almada, P., Alvarado-Uribe, J., Ceballos, H. G., Sidorov, G., & Gelbukh, A. (2023, October). MF-SET: A Multitask Learning Framework for Student Evaluation of Teaching. In Proceedings of the Future Technologies Conference (pp. 254-270). Cham: Springer Nature Switzerland. Link

    Gallardo, K., Butt, S., & Ceballos, H. (2023, June). Improvement of Teaching Competencies Training in Higher Education Faculty Based on Student Evaluations of Teaching and AI Systems. In International Conference in Information Technology and Education (pp. 555-563). Singapore: Springer Nature Singapore. Link

    Balouchzahi, F., Butt, S., Sidorov, G., & Gelbukh, A. (2023). ReDDIT: Regret detection and domain identification from text. Expert Systems with Applications, 225, 120099. Link

    Sidorov, G., Balouchzahi, F., Butt, S., & Gelbukh, A. (2023). Regret and Hope on Transformers: An Analysis of Transformers on Regret and Hope Speech Detection Datasets. Applied Sciences, 13(6), 3983. Link

  • 2022
  • Butt, S., Amjad, M., Balouchzahi, F., Ashraf, N., Sharma, R., Sidorov, G., & Gelbukh, A. (2022, December). EmoThreat@ FIRE2022: Shared Track on Emotions and Threat Detection in Urdu. In Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation (pp. 1-3). Link

    Balouchzahi, F., Butt, S., Hegde, A., Ashraf, N., Shashirekha, H. L., Sidorov, G., & Gelbukh, A. (2022, December). Overview of CoLI-Kanglish: Word Level Language Identification in Code-mixed Kannada-English Texts at ICON 2022. In Proceedings of the 19th International Conference on Natural Language Processing (ICON) (pp. 38-45). Link

    Butt, S., Amjad, M., Balouchzahi, F., Ashraf, N., Sharma, R., Sidorov, G., & Gelbukh, A. (2022, April). Overview of EmoThreat: Emotions and Threat Detection in Urdu at FIRE 2022. In Proceedings of the CEUR Workshop Proceedings, Chennai, India (pp. 22-24). Link

    Amjad, M., Butt, S., Zhila, A., Sidorov, G., Chanona-Hernandez, L., & Gelbukh, A. (2022). Survey of Fake News Datasets and Detection Methods in European and Asian Languages. Acta Polytechnica Hungarica, 19(10), 185-204. Link

    Butt, S., Sharma, S., Sharma, R., Sidorov, G., & Gelbukh, A. (2022). What goes on inside rumour and non-rumour tweets and their reactions: A Psycholinguistic Analyses. Computers in Human Behavior, 107345. Link

    Balouchzahi, F., Butt, S., Sidorov, G., & Gelbukh, A. (2022, May). CIC@ LT-EDI-ACL2022: Are transformers the only hope? Hope speech detection for Spanish and English comments. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion (pp. 206-211). Link

    Ashraf, N., Khan, L., Butt, S., Chang, H. T., Sidorov, G., & Gelbukh, A. (2022). Multi-label emotion classification of Urdu tweets. PeerJ Computer Science, 8, e896. Link

    Butt, S., Balouchzahi, F., Sidorov, G., & Gelbukh, A. (2022). CIC@ PAN: Simplifying Irony Profiling using Twitter Data. In CLEF (pp. 1613-0073). Link

    Ashraf, N., Rafiq, A., Butt, S., Shehzad, H. M. F., Sidorov, G., & Gelbukh, A. (2022). YouTube based religious hate speech and extremism detection dataset with machine learning baselines. Journal of Intelligent & Fuzzy Systems, (Preprint), 1-9. Link

    Amjad, M., Butt, S., Amjad, H. I., Zhila, A., Sidorov, G., & Gelbukh, A. (2022). Overview of the shared task on fake news detection in Urdu at FIRE 2021. Forum for Information Retrieval Evaluation. Link

    Amjad, M., Butt, S., Amjad, H. I., Zhila, A., Sidorov, G., & Gelbukh, A. (2022). Overview of Abusive and Threatening Language Detection in Urdu at FIRE Forum for Information Retrieval Evaluation. Link

  • 2021
  • Hoang, T. T., Butt, S., Angel, J., Sidorov, G., & Gelbukh, A. (2021). The Combination of BERT and Data Oversampling for Relation Set Prediction. International Semantic Web Conference (pp. 22-33). Link

    Amjad, M., Zhila, A., Sidorov, G., Labunets, A., Butt, S., Amjad, H. I., ... & Gelbukh, A. (2021, December). UrduThreat@ FIRE2021: Shared Track on Abusive Threat Identification in Urdu. In Forum for Information Retrieval Evaluation (pp. 9-11). Link

    Ashraf N., Butt S., Sidorov G., Gelbukh A. (2021) CIC at CheckThat! 2021: Fake News Detection Using Machine Learning and Data Augmentation. In Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum, Bucharest, Romania (online). Link

    Butt S., Ashraf N., Sidorov G., Gelbukh A. (2021) Sexism Identification using BERT and Data Augmentation - EXIST2021. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021), CEUR Workshop Proceedings. Link

    Butt S., Siddiqui M.H.F., Ashraf N., Sidorov G., Gelbukh A. (2021) Transformer-Based Extractive Social Media Question Answering on TweetQA. Computación y Sistemas, 25(1). Link

  • 2018
  • Rehmani T., Butt S., Baig I.R., Malik Z.H., Ali M. (2018) Designing Robot Receptionist for Overcoming Poor Infrastructure, Low Literacy and Low Rate of Female Interaction. In Companion of the 2018 ACM/IEEE International Conference on Human-Robot Interaction (HRI '18). Association for Computing Machinery, New York, NY, USA, 211–212. Link

    Malik Z.H., Butt S., Sajid H. (2019) Quality Scale for Rubric Based Evaluation in Capstone Project of Computer Science. In: Arai K., Kapoor S., Bhatia R. (eds) Intelligent Computing. SAI 2018. Advances in Intelligent Systems and Computing, vol 857. Springer, Cham. Link

Datasets

 

EmoThreat: Dataset for Multi-label Emotion Classification in Urdu Link

EmoThreat: Dataset for Threatening Language Detection Task in Urdu Link

CoLI-Kanglish: Dataset for Word Level Language Identification in Code-mixed Kannada-English Texts Link

ReDDIT: Dataset for Regret Detection and Domain Identification from English Texts Link (Email the corresponding author)

UrduFake: Dataset for Urdu Fake News named Bend-The-Truth Link

UrduThreat: Dataset for Abusive Language Detection in Urdu Tweets Link

Dataset for YouTube Based Religious Hate Speech and Extremism Detection from English Texts Link

Contact

saburb@tec.mx
saboor.butt2
Alternatively, you can use this form at your convenience.