I currently serve as a Postdoctoral Researcher at the Institute for the Future of Education (IFE), Tecnológico de Monterrey, Mexico. My role at IFE centers on leading interdisciplinary projects that explore the convergence of Natural Language Processing and Education.
I earned my Doctor of Philosophy (PhD) in Computer Science from the Centro de Investigación en Computación, Instituto Politécnico Nacional (IPN). My doctoral research focused on "Automatic Personality and Behavior Detection in Texts Using Deep Learning" under the guidance of Dr. Alexander Gelbukh and Dr. Grigori Sidorov. I also hold a master's degree with honors in Computer Science from the same institution, and a bachelor's degree from Forman Christian University in Lahore, with a double specialization in Software Engineering and Computer Sciences.
My expertise lies in the field of Natural Language Processing (NLP), and my broad research interests encompass areas such as Personality and Emotion Detection, Low Resource Languages, and Question Answering. I am dedicated to leveraging machine and deep learning techniques to develop computing solutions that provide deeper insights into human behavior.
In my free time, I love to work out, cook, travel, and read. I'm particularly passionate about tea, savoring its diverse flavors and the art of brewing the perfect cup. If you ever find yourself in my company, I'd be delighted to invite you for a cup of tea and some great conversation.
Butt, S., Mejía-Almada, P., Alvarado-Uribe, J., Ceballos, H. G., Sidorov, G., & Gelbukh, A. (2023, October). MF-SET: A Multitask Learning Framework for Student Evaluation of Teaching. In Proceedings of the Future Technologies Conference (pp. 254-270). Cham: Springer Nature Switzerland. Link
Gallardo, K., Butt, S., & Ceballos, H. (2023, June). Improvement of Teaching Competencies Training in Higher Education Faculty Based on Student Evaluations of Teaching and AI Systems. In International Conference in Information Technology and Education (pp. 555-563). Singapore: Springer Nature Singapore. Link
Balouchzahi, F., Butt, S., Sidorov, G., & Gelbukh, A. (2023). ReDDIT: Regret detection and domain identification from text. Expert Systems with Applications, 225, 120099. Link
Sidorov, G., Balouchzahi, F., Butt, S., & Gelbukh, A. (2023). Regret and Hope on Transformers: An Analysis of Transformers on Regret and Hope Speech Detection Datasets. Applied Sciences, 13(6), 3983. Link
Butt, S., Amjad, M., Balouchzahi, F., Ashraf, N., Sharma, R., Sidorov, G., & Gelbukh, A. (2022, December). EmoThreat@ FIRE2022: Shared Track on Emotions and Threat Detection in Urdu. In Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation (pp. 1-3). Link
Balouchzahi, F., Butt, S., Hegde, A., Ashraf, N., Shashirekha, H. L., Sidorov, G., & Gelbukh, A. (2022, December). Overview of CoLI-Kanglish: Word Level Language Identification in Code-mixed Kannada-English Texts at ICON 2022. In Proceedings of the 19th International Conference on Natural Language Processing (ICON) (pp. 38-45). Link
Butt, S., Amjad, M., Balouchzahi, F., Ashraf, N., Sharma, R., Sidorov, G., & Gelbukh, A. (2022, April). Overview of EmoThreat: Emotions and Threat Detection in Urdu at FIRE 2022. In Proceedings of the CEUR Workshop Proceedings, Chennai, India (pp. 22-24). Link
Amjad, M., Butt, S., Zhila, A., Sidorov, G., Chanona-Hernandez, L., & Gelbukh, A. (2022). Survey of Fake News Datasets and Detection Methods in European and Asian Languages. Acta Polytechnica Hungarica, 19(10), 185-204. Link
Butt, S., Sharma, S., Sharma, R., Sidorov, G., & Gelbukh, A. (2022). What goes on inside rumour and non-rumour tweets and their reactions: A Psycholinguistic Analyses. Computers in Human Behavior, 107345. Link
Balouchzahi, F., Butt, S., Sidorov, G., & Gelbukh, A. (2022, May). CIC@ LT-EDI-ACL2022: Are transformers the only hope? Hope speech detection for Spanish and English comments. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion (pp. 206-211). Link
Ashraf, N., Khan, L., Butt, S., Chang, H. T., Sidorov, G., & Gelbukh, A. (2022). Multi-label emotion classification of Urdu tweets. PeerJ Computer Science, 8, e896. Link
Butt, S., Balouchzahi, F., Sidorov, G., & Gelbukh, A. (2022). CIC@ PAN: Simplifying Irony Profiling using Twitter Data. In CLEF (pp. 1613-0073). Link
Ashraf, N., Rafiq, A., Butt, S., Shehzad, H. M. F., Sidorov, G., & Gelbukh, A. (2022). YouTube based religious hate speech and extremism detection dataset with machine learning baselines. Journal of Intelligent & Fuzzy Systems, (Preprint), 1-9. Link
Amjad, M., Butt, S., Amjad, H. I., Zhila, A., Sidorov, G., & Gelbukh, A. (2022). Overview of the shared task on fake news detection in Urdu at FIRE 2021. Forum for Information Retrieval Evaluation. Link
Amjad, M., Butt, S., Amjad, H. I., Zhila, A., Sidorov, G., & Gelbukh, A. (2022). Overview of Abusive and Threatening Language Detection in Urdu at FIRE Forum for Information Retrieval Evaluation. Link
Hoang, TT., Butt, S., Angel, J., Sidorov, G., & Gelbukh, A. (2021). The Combination of BERT and Data Oversampling for Relation Set Prediction. International Semantic Web Conference (pp. 22-33). Link
Amjad, M., Zhila, A., Sidorov, G., Labunets, A., Butt, S., Amjad, H. I., ... & Gelbukh, A. (2021, December). UrduThreat@ FIRE2021: Shared Track on Abusive Threat Identification in Urdu. In Forum for Information Retrieval Evaluation (pp. 9-11). Link
Ashraf N., Butt S., Sidorov G., Gelbukh A. (2021) CIC at CheckThat! 2021: Fake News detection Using Machine Learning And Data Augmentation. In Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum, Bucharest, Romania (online). Link
Butt S., Ashraf N., Sidorov G., Gelbukh A. (2021) Sexism Identification using BERT and Data Augmentation - EXIST2021. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021), CEUR Workshop Proceedings. Link
Butt S., Siddiqui M.H.F., Ashraf N., Sidorov G., Gelbukh A. (2021) Transformer-Based Extractive Social Media Question Answering on TweetQA. In Computación y Sistemas. 25(1). Link
Rehmani T., Butt S., Baig I.R., Malik Z.H., Ali M. (2018) Designing Robot Receptionist for Overcoming Poor Infrastructure, Low Literacy and Low Rate of Female Interaction. In Companion of the 2018 ACM/IEEE International Conference on Human-Robot Interaction (HRI '18). Association for Computing Machinery, New York, NY, USA, 211–212. Link
Malik Z.H., Butt S., Sajid H. (2019) Quality Scale for Rubric Based Evaluation in Capstone Project of Computer Science. In: Arai K., Kapoor S., Bhatia R. (eds) Intelligent Computing. SAI 2018. Advances in Intelligent Systems and Computing, vol 857. Springer, Cham. Link
EmoThreat: Dataset for Multi-label Emotion Classification in Urdu Link
EmoThreat: Dataset for Threatening Language Detection Task in Urdu Link
CoLI-Kanglish: Dataset for Word Level Language Identification in Code-mixed Kannada-English Texts Link
ReDDIT: Dataset for Regret Detection and Domain Identification from English Texts Link (Email the corresponding author)
UrduFake: Dataset for Urdu Fake News named Bend-The-Truth Link
UrduThreat: Dataset for Abusive Language Detection in Urdu Tweets Link
Dataset for YouTube Based Religious Hate Speech and Extremism Detection from English Texts Link