Prerequisites
Basic linear algebra, calculus and probability theory.
Course objective
Speech and natural language processing is a subfield of artificial intelligence used in an increasing number of applications. This course gives an overview of the techniques and tasks involved in the automatic processing of text and speech, and covers them in detail: some of the history of the field, the representation of textual and speech data, language modelling, machine translation, sentiment analysis and other labelling tasks, chatbots, and speech synthesis and recognition. The aim is to present the key concepts, algorithms and mathematical principles behind the state of the art, and to confront them with the reality of processing real data.
Learn more: https://github.com/rbawden/MVA_2024_SL/
Session organisation
The course consists of 7 three-hour slots.
Each three-hour slot will have a lecture lasting approximately two hours, followed by a quiz and Q&As.
Assessment
Evaluation consists of two parts:
- Quizzes (30% of the total grade): You’ll be given a link to an online questionnaire (Google Form) and will have 30 minutes to complete it. The form opens at exactly 6:00pm and closes at a time decided on the fly by the professors, generally 6:30pm. Any form submitted after the deadline will be automatically rejected and graded as zero. The quizzes contain comprehension questions, and the best 5 grades out of the 6 quizzes will be used for the average. Between 6:30pm and 7:00pm there will be a Q&A period where you’ll be able to ask questions about the course and the quiz.
- Final exam (70% of the total grade): This year (due to time constraints), there will be a final written exam, with theory questions on topics covered in the lectures.
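As an illustration of how the two parts combine, here is a minimal sketch of the weighting described above. It assumes that all grades are expressed on the same scale and that the quiz average is a plain mean of the five best quiz grades; the function name `final_grade` and the sample values are hypothetical, and the instructors' actual computation may differ in its details (for example in rounding).

```python
def final_grade(quiz_grades, exam_grade):
    """Combine quiz and exam grades using the 30% / 70% weighting.

    Assumption: every grade is on the same scale, and the quiz average
    is the plain mean of the best 5 of the 6 quiz grades.
    """
    if len(quiz_grades) != 6:
        raise ValueError("expected exactly 6 quiz grades")
    # Keep the five highest quiz grades, dropping the lowest one.
    best_five = sorted(quiz_grades, reverse=True)[:5]
    quiz_average = sum(best_five) / 5
    # Quizzes count for 30% of the total grade, the final exam for 70%.
    return 0.3 * quiz_average + 0.7 * exam_grade


# Hypothetical example: a quiz graded as zero (e.g. a late submission)
# is the one dropped from the average.
print(final_grade([10, 12, 14, 16, 18, 0], 15))
```

Under these assumptions, a single missed or late quiz does not lower the quiz average, since only the best five grades are kept.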
References
The recommended, but not obligatory, textbook for the course is D. Jurafsky & J. Martin, Speech and Language Processing: 3rd (online) edition for chapters that are already available [J&M3], 2nd edition otherwise [J&M2]. Readings for each session will be provided by the instructors.
Topics covered
- speech features & signal processing
- hidden Markov & finite-state modelling
- word embeddings
- deep learning for NLP (RNNs, transformers)
- neural language modelling, including large language models (LLMs)
- machine translation
- sentiment analysis
- sequence labelling tasks
- chatbots
- evaluation: comparing human and machine performance
- speech synthesis and speech recognition
Learn more
Chloé Clavel (INRIA)
Benoit Sagot (INRIA)
Emmanuel Dupoux (INRIA)
Rachel Bawden (INRIA)