Course objective
The ALTEGRAD course (28 hours) provides an overview of state-of-the-art ML and AI methods for text and graph data, with a significant focus on applications. Each session comprises two hours of lecture followed by two hours of programming.
Grading is based on a final data challenge plus lab-based evaluation.
TO ENROL – IMPORTANT!
Registration for the course is required: link to form
Course web page: here
Informative video: here
Course Syllabus 2021-2022:
- Graph-of-words GoWvis
- Keyword extraction (TFIDF, TextRank, ECIR’15, EMNLP’16)
- Extractive summarization (EMNLP’17)
- Sub-event detection in Twitter streams (ICWSM’17)
- Graph-based document classification: TW-IDF (ASONAM’15), TW-ICW, subgraphs (ACL’15)
- Abstractive summarization (ACL’18)
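As an illustration of the graph-of-words and TextRank topics above, here is a minimal sketch, not course material. It assumes a sliding window of 3 tokens and an unweighted, undirected graph; the function names `graph_of_words` and `textrank` and the toy sentence are hypothetical.

```python
from collections import defaultdict


def graph_of_words(tokens, window=3):
    """Undirected co-occurrence graph: two distinct terms are linked
    if they appear within `window` consecutive tokens of each other."""
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Unweighted PageRank computed by power iteration on the term graph."""
    n = len(graph)
    scores = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            # Each neighbour shares its score equally among its own neighbours.
            incoming = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1 - damping) / n + damping * incoming
        scores = new_scores
    return scores


# Toy document, already tokenized and lower-cased (an assumption of this sketch).
tokens = "graph kernels compare graph structure and graph kernels scale".split()
gow = graph_of_words(tokens, window=3)
scores = textrank(gow)
top_terms = sorted(scores, key=scores.get, reverse=True)[:3]
print(top_terms)
```

The highest-scoring terms serve as keyword candidates. A real pipeline would typically add stop-word removal and stemming before building the graph.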
1.2 TEXT – NLP – Word & doc embeddings (P)
- Word embeddings: word2vec and GloVe models, doc2vec, subword embeddings, Latent Semantic Indexing, context-based embeddings
- Document similarity metrics: Word Mover’s Distance, shortest-path kernels (EMNLP’16)
1.3 Deep learning for NLP
- CNNs, RNNs, LSTMs for NLP, text classification
- Meta-architectures
- Sequence to Sequence: Attention (HAN)
- Domains: summarization.
- Translation, image captioning
- Unsupervised word sense detection/disambiguation
- French linguistic resources: http://master2-bigdata.polytechnique.fr/FrenchLinguisticResources/
Course Syllabus 2020-2021:
1.4 Graph kernels, community detection
GraKeL Python library: https://github.com/ysig/GraKeL/
1.5 Deep Learning for Graphs – node classification
- Node embeddings (DeepWalk & node2vec) for node classification and link prediction
- Supervised node embeddings (GCNN, …)
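To make the DeepWalk item above concrete, here is a minimal sketch of its first stage, uniform truncated random walks. The walks are then treated as sentences and fed to a skip-gram model such as word2vec, which is not shown here. The function name `random_walks`, its parameters, and the toy graph are hypothetical; node2vec would replace the uniform choice with a biased one.

```python
import random


def random_walks(adjacency, walks_per_node=2, walk_length=5, seed=0):
    """Generate uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a random order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop the walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Toy undirected graph stored as adjacency lists (an assumption of this sketch).
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walks = random_walks(adjacency)
for walk in walks:
    print(" ".join(walk))
```

Nodes that co-occur often in these walks end up with similar embeddings, which is what makes them usable for node classification and link prediction.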
1.6 Deep Learning for Graphs – Graph classification, GNNs
- Graph CNNs
- Message passing
- Graph auto-encoders
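As an illustration of message passing, here is a minimal sketch of a single round in plain Python. It assumes mean aggregation over a node and its neighbours followed by a linear map and ReLU, which is a simplification and not the exact normalization used by any particular graph CNN; `message_passing_step` and the toy data are hypothetical.

```python
def message_passing_step(adjacency, features, weight):
    """One round of message passing: each node averages its own feature
    vector with those of its neighbours, then applies a linear map and ReLU."""
    updated = {}
    for node, neighbours in adjacency.items():
        group = [node] + list(neighbours)  # include a self-loop
        dim = len(features[node])
        # Aggregate: element-wise mean over the node and its neighbours.
        mean = [sum(features[n][k] for n in group) / len(group) for k in range(dim)]
        # Update: multiply by the weight matrix (dim x out_dim), then ReLU.
        out = [
            sum(mean[k] * weight[k][j] for k in range(dim))
            for j in range(len(weight[0]))
        ]
        updated[node] = [max(0.0, v) for v in out]
    return updated


# Toy path graph 0 - 1 - 2 with 2-dimensional node features.
adjacency = {0: [1], 1: [0, 2], 2: [1]}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity matrix, so only aggregation is visible
hidden = message_passing_step(adjacency, features, weight)
print(hidden)
```

Stacking several such rounds lets information travel further across the graph. For graph classification, the node vectors are then pooled (for example summed) into a single graph-level vector.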
1.7 Set embeddings – point clouds
1.8 Network Architecture Search – interpretability.
Michalis Vazirgiannis
(Polytechnique)