Text Mining

Delivery institution

Faculty of Informatics
Data Science and Engineering Department

Instructor(s):

Zakarya Farou

Start date

13 September 2026

End date

18 December 2026

Study field

CHARM priority field

Study level

Study load, ECTS

3

Short description

– Introduction to Text Mining (NLP)
– Text representation
– Language modeling
– Text classification
– Neural networks for text
– Attention-based language modeling

Full description

https://neptun.elte.hu/MobilityCourses?Faculty=&Programme=&AcademicTerm=&Published=&SearchText=text+mining

Learning outcomes

By the end of the course, the learner will understand how current technologies for data analysis and modelling operate and will be able to apply them to real-life scenarios, including those involving large volumes of data.

The learner will be familiar with techniques for storing, processing, and visualising large datasets, as well as with the characteristics of different tool ecosystems.

The learner will understand the main application areas of data science, the associated challenges, possible solutions, and the limitations of related methods and techniques.

The learner will be able to identify relationships between different types of data, extract meaningful information, and solve problems through data transformation in multidisciplinary contexts.

Course requirements

This course has no specific prerequisites.

Places available

30

Course literature (compulsory or recommended):

– D. Jurafsky, J. H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (2nd ed.), Prentice Hall, 2009.
– C. D. Manning, H. Schütze. Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA, 1999.

Planned educational activities and teaching methods:

The course will combine lectures, practical demonstrations, hands-on exercises, case studies, and project-based learning. Teaching activities will focus on introducing key concepts, applying data analysis and modelling techniques to real-life examples, and encouraging learners to interpret results and solve problems using data. Individual and group activities may be included to support active learning, discussion, and multidisciplinary collaboration.

Course code

IPM-24ATTME

Language

Assessment method

Presentation

Final certification

Transcript of records

No additional certificate is delivered.

Assessment date

17 January 2027

Modality

Learning management system in use

Canvas, Moodle, Microsoft Teams

Contact hours per week for the student:

2

Specific regular weekly teaching day/time

Monday/10:00-12:00

Time zone