Jan Šnajder

Jan Šnajder, PhD
Associate Professor

Text Analysis and Knowledge Engineering Lab
Faculty of Electrical Engineering and Computing
University of Zagreb, Croatia

Phone: +385 1 6129 871
Email: jan (dot) snajder (at) fer (dot) hr

LinkedIn   Scholar

I am an Associate Professor at the Faculty of Electrical Engineering and Computing (FER) at the University of Zagreb and a member of the Text Analysis and Knowledge Engineering Lab (TakeLab). My research interests are in natural language processing (NLP), machine learning, and language technologies. My current focus is on lexical semantics, information extraction, and opinion mining. I am a fan of functional programming, Haskell in particular.

Short Bio

I received my MSc and PhD degrees in Computer Science from the University of Zagreb, Faculty of Electrical Engineering and Computing (UNIZG FER), Zagreb, Croatia, in 2006 and 2010, respectively. I worked as a research assistant from 2002, and since 2016 I have been an Associate Professor at UNIZG FER. In 2012 and 2013 I was a visiting researcher at the Department of Computational Linguistics at Heidelberg University. In 2015 I was a visiting researcher at NICT in Kyoto, and in 2014 and 2015 a visiting researcher at the IMS, Stuttgart University. In 2016 I was a visiting researcher at the Department of Computing and Information Systems, University of Melbourne.

Curriculum vitae

Software & Data


Current PhD Students

  1. Domagoj Alagić - word sense modelling
  2. Filip Boltužić - argumentation mining
  3. Matej Gjurković - author profiling
  4. Damir Korenčić (with Dr. Strahil Ristov) - topic modelling
  5. Milan Pavlović - threat detection
  6. Tamara Sladoljev-Agejev (with Prof. Svjetlana Kolić-Vehovec) - discourse analysis
  7. Martin Tutek - distributional semantics

Current MA Students

  1. Jure Baban
  2. Ana Brassard
  3. Luka Dulčić
  4. Bartol Freškura
  5. Bruno Gavranović
  6. Martin Gluhak
  7. Viktor Golem
  8. Paula Gombar
  9. Robert Injac
  10. Marin Kačan
  11. Tin Kuculo
  12. Marin Kukovačec
  13. Toni Kukurin
  14. David Lozić
  15. Luka Markušić
  16. Stipan Mikulić
  17. Ivan Mršić
  18. Mihael Nikić
  19. Ivan Paljak
  20. Lukrecija Puljić
  21. Tome Radman
  22. Ivan Sekulić
  23. Filip Šaina
  24. Antonio Šajatović
  25. Doria Šarić
  26. Fredi Šarić

Current BA Students

  1. Ivan Crnomarković
  2. Fabijan Čorak
  3. Ivan Dujmić
  4. Fran Grgić
  5. Mihovil Ilakovac
  6. Vinko Kašljević
  7. Roman Kerčmar
  8. Mate Mijolović
  9. Gregor Orlić
  10. Josip Torić

Completed PhD Theses

  1. Mladen Karan. Computer-Aided Construction and Semantic Search of Question and Answer Collections. 2017.
  2. Vedran Galetić (with Prof. Marijan Palmović). Formalization and Quantification of a Cognitively Motivated Conceptual Space Model Based on The Prototype Theory. 2016.
  3. Goran Glavaš. Text Information and Retrieval Based on Event Graphs. 2014.

Completed MA Theses

  1. Filip Čulinović. Evaluating Croatian Language Word Representations. 2017.
  2. Tomislav Marinković. Deep Learning Models for the Analysis of User Comments on Social Networks. 2017.
  3. Matej Paradžik. Domain Adaptation for Sentiment Analysis from Text. 2017.
  4. Tena Perak. Semantic Analysis of Math Word Problems. 2017.
  5. Luka Skukan. Application of Compositional Distributional Semantics for Semantic Text Similarity. 2017.
  6. Maja Buljan. Multiword Identification Based on the Combination of Linguistic Features. 2016.
  7. Vjeran Crnjak. Learning to Search for Solving Natural Language Processing Tasks. 2016.
  8. Zoran Medić. Compositional Distributional Semantics Based on the Lexical Function Model. 2016.
  9. Dino Radaković. A Joint Model for Named Entity Relation Extraction. 2016.
  10. Sven Vidak. Deep Learning for Language Modeling of the Croatian Language. 2016.
  11. Toni Antunović. Automated Extraction of Bilingual Lexicons Based on Semantic Vector Spaces. 2015.
  12. Krešimir Baksa. Shallow Semantic Parsing of Croatian Texts. 2015.
  13. Dino Dolović. Sentiment Analysis in Tweets in Croatian Language. 2015.
  14. Goran Gašić. Deep Learning of Word Embeddings for Tagging Models for Croatian Texts. 2015.
  15. Lana Lisjak. Recognizing Textual Entailment in Croatian Texts. 2015.
  16. Hermina Petric Maretić. Project Proposals Analysis using Statistical Natural Language Processing. 2015.
  17. Mihael Šafarić. Feature Selection and Document Representation Methods for Text Classification. 2015.
  18. Petra Almić. A Model for Determining Semantic Compositionality of Croatian Multi-Word Expressions. 2014.
  19. Marko Bekavac. Word Sense Induction and Discrimination Model for Croatian Words. 2014.
  20. Petra Bevandić. Optimizing Dependency Parsing Parameters for Croatian Language. 2014.
  21. Siniša Biđin. Using Deep Learning for Sentiment Analysis of Croatian Expressions. 2014.
  22. Luka Krajcar. Sentiment Analysis of Tweets in Croatian Language. 2014.
  23. Lovro Rožić (with Mladen Vuković). Functional Programming. 2014.
  24. Martin Tutek. Multi-label Document Classification using EuroVoc Thesaurus. 2014.
  25. Leo Zuanović. Recurrent Neural Network Based Model of Croatian Language. 2014.
  26. Filip Petkovski. Application of Partial Membership Models to Keyphrase Extraction from Croatian Documents. 2013.
  27. Tin Franović. Classification of Email Importance Based on Speech Acts. 2013.
  28. Matija Hanževački. Coreference Resolution in Croatian Texts. 2013.
  29. Josip Bakić. Automatic Content Extraction from Web Pages. 2012.
  30. Sonja Grđan. Application of Machine Learning Methods for EEG-Based Brain-Computer Interface. 2012.
  31. Ante Kegalj. Sentiment Analysis Based on Prior Word Polarity. 2012.
  32. Ivan Krišto. Using Machine Learning Methods to Improve Document Retrieval. 2012.
  33. Tomislav Lombarović. Named Entity Recognition and Classification for Text in Croatian Language. 2012.
  34. Mladen Marović. Event and Temporal Relation Extraction in Croatian Language Texts. 2012.
  35. Hrvoje Peradin. Constraint Grammar-based Parsing of Croatian Texts. 2012.
  36. Veljko Srdarević. Text Report Generation Based on Structured Data. 2012.
  37. Fran Dragomanović (with Prof. Bojana Dalbelo Bašić). Acronym Extraction in Croatian Language. 2011.
  38. Zoran Hranj. Unsupervised Coreference Resolution. 2011.
  39. Vedrana Janković. Computational Models of Distributional Lexical Semantics in Croatian Language. 2011.
  40. Ivan Kmetović. Matching Co-referent Named Entities Using Machine Learning. 2011.
  41. Slavko Kručaj. Applying Machine Learning Methods to User Review Summarization. 2011.
  42. Ivan Kusalić. Application of Topic Models to Analysis of Croatian Documents. 2011.
  43. Ognjen Lajšić. Grammar and Style Checker for Croatian Language. 2011.
  44. Vladimir Manzin. Computer Agents for Poker. 2011.
  45. Vjekoslav Osmann. Tagging Parts of Speech in Croatian Texts. 2011.
  46. Paško Pajdek. Deep Generative Models for Semantic Document Clustering. 2011.
  47. Josip Saratlija. Unsupervised Parser for Croatian Language. 2011.
  48. Nikola Šantić. Automatic Paraphrasing of Croatian Expressions and Sentences. 2011.
  49. Matea Biočić (with Prof. Bojana Dalbelo Bašić). Word Sense Discrimination Using Expectation Maximization Algorithm. 2010.
  50. Zlatan Hot (with Prof. Bojana Dalbelo Bašić). A Stemming Algorithm Based on String Clustering. 2010.
  51. Marin Japec (with Prof. Bojana Dalbelo Bašić). System for Organizing and Sharing Knowledge Based on Topic Maps. 2010.
  52. Matija Lacković (with Prof. Bojana Dalbelo Bašić). Program Environment for Execution of Tournaments for Game Playing Algorithms. 2010.
  53. Nikola Novak (with Prof. Bojana Dalbelo Bašić). Implementation of a Game Simulator and Checkers Game-playing Algorithms. 2010.
  54. Ivan Šolta (with Prof. Bojana Dalbelo Bašić). Determining Semantic Orientation of Subjective Words and Phrases. 2010.
  55. Davor Delač (with Prof. Bojana Dalbelo Bašić). Collocation Extraction from Corpus. 2009.
  56. Lovro Žmak (with Prof. Bojana Dalbelo Bašić). FAQ Retrieval System for Croatian Language. 2009.
  57. Srđan Vuković (with Prof. Bojana Dalbelo Bašić). A Heuristic Algorithm for Matching of Address Data. 2008.

Completed BA Theses

  1. Fran-Andrija Arbanas. Question Type Classification for a Natural Language Database Interface. 2017.
  2. Marin Kukovačec. Automated Sarcasm Detection in Social Network Users' Comments. 2017.
  3. Toni Kukurin. Claim and Stance Classification in Online Discussions Using Machine Learning. 2017.
  4. David Lozić. Intrinsic Plagiarism Detection in Student Theses. 2017.
  5. Juraj Malenica. Question Context Prediction for an Interactive Natural Language Database Interface. 2017.
  6. Ivan Mršić. Entity Recognition and Classification for a Natural Language Database Interface. 2017.
  7. Lukrecija Puljić. Author Profiling on Social Networks Using Machine Learning. 2017.
  8. Filip Šaina. Sentiment Summarization from Student Course Questionnaires. 2017.
  9. Antonio Šajatović. Predicting Newsworthiness of News Stories Using Machine Learning. 2017.
  10. Doria Šarić. Computational Analysis of the Similarity of Math Word Problems. 2017.
  11. Ivan Tokić. Cross-Lingual Plagiarism Detection from Wikipedia. 2017.
  12. Bartol Freškura. Application of Deep Learning for Stance Detection in User Comments. 2016.
  13. Bruno Gavranović. Application of Deep Learning for Sentiment Analysis. 2016.
  14. Filip Hrenić. Detection of Inappropriate Messages in Online Chats. 2016.
  15. Marin Kačan. Detecting Lexical Transfer Errors of Second Language Learners. 2016.
  16. Mihael Nikić. Application of Machine Learning for Topic-Based Sentiment Analysis. 2016.
  17. Stipan Mikulić. Use of Distributional Semantic Models in the Word Association Game. 2016.
  18. Filip Čulinović. Acquisition of Verb Classes from Corpus using Unsupervised Machine Learning. 2015.
  19. Paula Gombar. Contextual Sentiment Analysis of Croatian Expressions. 2015.
  20. Ivan Paljak. Stance Classification and Analysis in Online User Comments. 2015.
  21. Ivan Sekulić. Extraction of Semantic Verb Relations from Croatian Corpora. 2015.
  22. Jura Šlosel. Entity-Based Coherence Model for Croatian Texts. 2015.
  23. Vjeran Crnjak. Part-of-Speech Tagging for Croatian using Conditional Random Fields. 2014.
  24. Stjepan Glavina. Machine Learning of Document Classification Rules. 2014.
  25. Zoran Medić. Quotation Extraction from News Stories in Croatian Language. 2014.
  26. Matej Paradžik. Semi-Supervised Acquisition of Sentiment Polarity Lexicon. 2014.
  27. Dino Radaković. Applying Semantic Kernel Functions in Text Classification. 2014.
  28. Luka Skukan. Temporal Expression Tagging for Croatian Texts. 2014.
  29. Sandra Trkulja. Feature Construction and Selection for Document Classification in Croatian Language. 2014.
  30. Sven Vidak. Offensive Text Detection using Machine Learning Methods. 2014.
  31. Ivana Balažević. Document Clustering Using Self-organizing Neural Networks. 2012.
  32. Marko Bekavac. Application of Genetic Programming in Keyphrase Extraction. 2012.
  33. Petra Bevandić. Automatic Natural Language Identification. 2012.
  34. Goran Gašić. Automatic Tagging of Croatian Newswire Articles. 2012.
  35. Luka Krajcar. Error Correction in Texts Produced by Speech Recognition of Croatian. 2012.
  36. Zolik Nemet. Extraction of Acronyms from Corpus of Texts in Croatian Language. 2012.
  37. Roko Pancirov. Automatic Extraction of Bilingual Dictionaries Based on Wikipedia. 2012.
  38. Martin Tutek. Using Wikipedia for Automatic Word Sense Disambiguation. 2012.
  39. Leo Zuanović. Machine Learning of Croatian Lemmatization Rules. 2012.
  40. Siniša Biđin. A Controlled Natural Language Parser. 2011.
  41. Matija Hanževački. Temporal Expression Tagging in Croatian Texts. 2011.
  42. Ante Kegalj (with Prof. Bojana Dalbelo Bašić). Automated Sentence Boundary Detection. 2010.
  43. Tomislav Lombarović (with Prof. Bojana Dalbelo Bašić). Question Type Classification for Information Retrieval Systems. 2010.
  44. Mladen Marović (with Prof. Bojana Dalbelo Bašić). OCR Error Correction. 2010.
  45. Mladen Mikša (with Prof. Bojana Dalbelo Bašić). Correction of Merged Words Errors in Texts Obtained by Optical Character Recognition. 2010.
  46. Veljko Srdarević (with Prof. Bojana Dalbelo Bašić). Building a Stemming Algorithm Using Genetic Programming. 2010.
  47. Zoran Hranj (with Prof. Bojana Dalbelo Bašić). Structure-Based Web Page Comparison Algorithm. 2009.
  48. Ivan Karačić (with Prof. Bojana Dalbelo Bašić). Word Sense Discrimination. 2009.
  49. Ivan Kmetović (with Prof. Bojana Dalbelo Bašić). Keyword Extraction from Text Using Decision Trees. 2009.
  50. Ivan Krišto (with Prof. Bojana Dalbelo Bašić). Web Page Cleaning Techniques for Text Mining. 2009.
  51. Ognjen Lajšić (with Prof. Bojana Dalbelo Bašić). OCR Error Correction. 2009.
  52. Josip Saratlija (with Prof. Bojana Dalbelo Bašić). Keyword Extraction Based on Document Clustering. 2009.
  53. Nikola Šantić (with Prof. Bojana Dalbelo Bašić). Automatic Diacritics Restoration in Croatian Texts. 2009.
  54. Igor Šoš (with Prof. Bojana Dalbelo Bašić). Client Side of Distributed Linguistic Resource Annotator. 2009.
  55. Marin Japec (with Prof. Bojana Dalbelo Bašić). Dialogue System in Croatian Language. 2008.
  56. Željko Rumenjak (with Prof. Bojana Dalbelo Bašić). Distributed Linguistic Resource Annotator. 2008.
  57. Ivan Šolta (with Prof. Bojana Dalbelo Bašić). Query Correction Based on Levenshtein Distance. 2008.
