CS 188: Natural Language Processing — Winter 2022
Prof. Nanyun (Violet) Peng



Announcements


Course Information

Course objectives: Welcome! This course is designed to introduce you to some of the problems and solutions of NLP, and their relation to machine learning, statistics, linguistics, and social sciences. You need to know how to program and use common data structures.
It also helps, though it is not required, to have some prior familiarity with linear algebra and probability.
At the end you should agree (I hope!) that language is subtle and interesting, feel some ownership over some of NLP's techniques, and be able to understand research papers in the field.

Lectures: M/W 12:00 - 1:50pm
Location: Royce Hall 190.
Prof: Nanyun (Violet) Peng Email: violetpeng@cs.ucla.edu
TAs: Te-Lin Wu Email: telinwu@g.ucla.edu
Mingyu Derek Ma Email: ma@cs.ucla.edu
Office hrs: Prof: Mon. 11:00am at Eng VI 397A, or Zoom: link
TAs:
Wu: Thu. 1:00pm - 2:00pm at Eng VI 389, or Zoom: link
Ma: Tue. 1:00pm - 2:00pm at Eng VI 389, or Zoom: link
TA sessions: Wu: Fri. 12:00pm - 1:50pm at Renee and David Kaplan Hall 169, or Zoom: link
Ma: Fri. 2:00pm - 3:50pm at Royce Hall 190, or Zoom: link
Discussion site: Piazza https://piazza.com/class/kxs66p57qcpw5
... public questions, discussion, announcements
Web page: https://vnpeng.net/cs188_win22.html
Textbook: Jurafsky & Martin, 3rd ed. (recommended)
Manning & Schütze (recommended)
Policies: Grading: homework 30%, project 20%, midterm 20%, final 25%, participation 5%
Honesty: UCLA Student Conduct Code


Schedule

Warning: The schedule below may change. Links to future lectures and assignments are just placeholders and will not be available until shortly before or after the actual lecture.


Week Monday Wednesday Friday (TA sessions) Suggested Reading
1/3 Introduction
  • Why is NLP hard? What's important?
  • Levels of language
  • NLP applications
  • Text classification and lexical semantics
  • Text classification
  • Naive Bayes classifier
  • Logistic Regression
  • Semantic phenomena and representations
  • WordNet
  • Review of linear algebra and calculus
  • Intro: J&M chapter 1
  • Chomsky hierarchy: J&M 16
  • Prob/Bayes: M&S 2
1/10 Project description out
    Distributional semantics
  • Word-Document Matrix
  • LSA
  • Semantic Similarity
  • Word Vectors
  • N-gram language models
  • How to model language?
  • What's wrong with n-grams?
  • What do language models model?
  • Data preparation and ML practice
  • Overview of ML system components
  • Language models: J&M 3
1/17 Assignment 1 release
    No lecture (MLK holiday)
    Smoothing n-grams
  • Add-one or add-λ smoothing
  • Cross-validation
  • Smoothing with backoff
  • Log-linear models
  • Intro to Google Cloud computing
  • Intro to Colab
  • Smoothing: J&M 3; Rosenfeld (2000)
1/24 Intro to neural language models
  • Conditional log-linear models
  • Maximum likelihood, regularization
  • Feedforward neural language models
  • Recurrent neural language models
  • Recurrent neural networks (RNNs)
  • Long short-term memory networks (LSTMs)
  • Deep learning workshop
  • Intro to PyTorch
  • Neural language models: J&M 7
  • OpenAI blog post GPT-2 (with paper)
1/31 Assignment 1 due
    Transformers and Masked Language Models
  • The transformer model
  • Masked language models: BERT
  • Project midterm report due
    Syntax
  • Word segmentation
  • Chunking
  • Part-of-speech tagging
  • Assignment 1 answer keys release
  • Neural network basics
  • PyTorch tutorial
  • Transformer paper; BERT paper
2/7 Midterm exam
    (12:00-1:50pm in class)

    Return assignment 1 grades
    Sequence tagging models
  • Hidden Markov Models (HMMs)
  • Maximum Entropy Markov Models (MEMMs)
  • Return project feedback
  • More PyTorch tutorial
  • Intro to Hugging Face
  • The Viterbi Algorithm: J&M 8
  • Hidden Markov Models: J&M Appendix A
2/14 Assignment 2 release
    Sequence tagging models (cont.)
  • Conditional Random Fields (CRFs)
  • Neural CRFs (if time permits)
  • Named Entity Recognition
  • Intro to NER
  • Sequence tagging models
  • Nested NER
  • Return midterm grades
  • Deep dive into Hugging Face tools
  • John Lafferty's paper on CRFs
2/21 No lecture (Presidents' Day)
    Probabilistic parsing
  • What is parsing?
  • Why is it useful?
  • Brute-force algorithm
  • CKY algorithms
  • PCFG parsing
  • Review midterm exam
  • Attributes: J&M 12
  • Parsing: J&M 13
2/28 Assignment 2 due
    Dependency Parser
  • Dependency grammar
  • Dependency trees
  • Dependency Parser (Cont.)
  • Shift-reduce parser
  • Final exam recitation
  • CCG: Steedman & Baldridge; more
  • TAG/TSG: Van Noord, Guo, Zhang 1/2/3
  • Prob. parsing: J&M 14
3/7 Return assignment 2 grades
    Intro to Machine Translation
  • Introduction
  • History
  • Evaluation
  • Sequence-to-sequence models
  • Socially responsible NLP
  • Biases in word embeddings
  • Biases in coreference resolution
  • Bias-mitigation algorithms
  • Project final report due
  • Final exam recitation
  • MT: J&M 25, M&S 13, statmt.org; tutorial (2003), workbook (1999), introductory essay (1997), technical paper (1993); tutorial (2006) focusing on more recent developments (slides, 3-hour video part 1, part 2)