Hello, World!

some years have passed, though

Hi 👋 My name is Andrea Pedrotti and I am currently a Post-Doc at the Institute of Information Science and Technologies of the National Council of Research (ISTI-CNR). Here, I am a member of the AI4Text Group in the Artificial Intelligence for Media and Humanities laboratory. I obtained my PhD in Computer Science from the University of Pisa, with the thesis Heterogeneous Transfer Learning in Natural Language Processing.

My primary interests lie in representation learning, at both large and small scale, with a particular interest in heterogeneous transfer learning. I like experimenting with different domains, such as languages and perceptual modalities, within multimodal and multilingual settings.

I received my master’s degree in Digital Humanities from the University of Pisa with a thesis on Cross-Lingual Text-Classification, supervised by Dr. Fabrizio Sebastiani and Dr. Alejandro Moreo.

In 2022, I spent a few months at the HD NLP Group of Heidelberg University under the supervision of Prof. Dr. Anette Frank, working on the assessment of Video-and-Language Models.

In 2023, I visited the Language Technology Lab at Cambridge University, where I worked on the interplay between multilingual and multimodal models together with Dr. Ivan Vulić.

Check out my Google Scholar profile for the latest updates on my research.

For any questions, feel free to drop me an email at andrea.pedrotti@isti.cnr.it or connect on LinkedIn 🙃

Recent Publications:

How Humans and LLMs Organize Conceptual Knowledge: Exploring Subordinate Categories in Italian. Andrea Pedrotti, Giulia Rambelli, Caterina Villani, and Marianna Bolognesi. 2025. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025).
Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors. Andrea Pedrotti, Michele Papucci, Cristiano Ciaccio, Alessio Miaschi, Giovanni Puccetti, Felice Dell’Orletta, and Andrea Esuli. 2025. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Findings of ACL 2025).
Multimodal Heterogeneous Transfer Learning for Multilingual Image-Text Classification. Andrea Pedrotti, Alejandro Moreo, and Fabrizio Sebastiani. 2024. Late-Breaking Contributions at the 27th International Conference on Discovery Science (DS 2024).
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models. Ilker Kesen, Andrea Pedrotti, Moustafa Dogan et al. 2023. Proceedings of the 12th International Conference on Learning Representations (ICLR 2024).
Generalized Funnelling: Ensemble Learning and Heterogeneous Document Embeddings for Cross-Lingual Text Classification. Alejandro Moreo, Andrea Pedrotti, and Fabrizio Sebastiani. 2022. ACM Transactions on Information Systems (TOIS) 41.
Heterogeneous Document Embeddings for Cross-Lingual Text Classification. Alejandro Moreo, Andrea Pedrotti, and Fabrizio Sebastiani. 2021. Proceedings of the 36th Annual ACM Symposium on Applied Computing (SAC 2021).

Talks & Seminars

First CNR-DFKI Workshop on AI Technologies, where I represented the AI4Text group, presenting our current research directions and discussing future scenarios for the NLP field.

Seminar Series Cambridge, “ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models”

Colloquium Talk Heidelberg, “What’s In An Action? A benchmark for Video-and-Language models through the lens of change-of-state verbs”