CS 162: Natural Language Processing — Winter 2023
Prof. Nanyun (Violet) Peng



Announcements


Course Information

Course objectives: Welcome! This course is designed to introduce you to some of the problems and solutions of NLP, and their relation to machine learning, statistics, linguistics, and social sciences. You need to know how to program and use common data structures.
Some prior familiarity with linear algebra and probability is helpful but not required.
At the end you should agree (I hope!) that language is subtle and interesting, feel some ownership over some of NLP's techniques, and be able to understand research papers in the field.

Lectures: M/W 12:00 - 1:50pm
Location: KAPLAN 169.
Prof: Nanyun (Violet) Peng Email: violetpeng@cs.ucla.edu
TAs: Yufei Tian Email: yufeit@cs.ucla.edu
Tanmay Parekh Email: tparekh@cs.ucla.edu
Office hrs: Prof: Mon. 11:00am - 12:00pm at Eng VI 397A; or Zoom: link
TAs:
Yufei: Tuesday 11:00am - 12:00pm Eng VI 389; or Zoom: link
Tanmay: Thursday 11:00am - 12:00pm Eng VI 389; or Zoom: link
TA sessions: Sec 1A: Friday 2:00 - 3:50pm, Haines A25 (Tanmay Parekh)
Sec 1B: Friday 12:00 - 1:50pm, Haines A2 (Yufei Tian)
Discussion site: Piazza https://piazza.com/ucla/winter2023/cs162
... public questions, discussion, announcements
Web page: https://vnpeng.net/cs162_win23.html
Textbook: Jurafsky & Martin, 3rd ed. (recommended)
Manning & Schütze (recommended)
Policies: Grading: homework 35%, project 15%, midterm 20%, final 25%, participation 5%
Honesty: UCLA Student Conduct Code


Schedule

Warning: The schedule below may change. Links to future lectures and assignments are just placeholders and will not be available until shortly before or after the actual lecture.


Week Monday Wednesday Friday (TA sessions) Suggested Reading
1/9 Introduction
  • Why is NLP hard? What's important?
  • Levels of language
  • NLP applications
  • Project description out
    Text classification and lexical semantics
  • Text classification
  • Naive Bayes classifier
  • Logistic Regression
  • Review of linear algebra and calculus
  • Intro to Google Cloud computing
  • Intro to Colab
  • Intro: J&M chapter 1
  • Chomsky hierarchy: J&M 16
  • Prob/Bayes: M&S 2
  • Naive Bayes: J&M 4
  • Logistic Regression: J&M 5
1/16 No lecture (MLK holiday)
    Assignment 1 release
    Lexical semantics
  • Semantic phenomena and representations
  • WordNet
  • Thesaurus-based semantic similarity
  • Data preparation and ML practice
  • Overview of ML system components
  • Project Milestone 1 Discussion
  • Language models: J&M 3
1/23 Distributional semantics
  • Word-Document Matrix
  • LSA
  • Semantic Similarity
  • Word Vectors
  • N-gram language models
  • How to model language?
  • What's wrong with n-grams?
  • What do language models model?
  • Neural network basics
  • PyTorch Part (1)
  • Smoothing: J&M 3; Rosenfeld (2000)
1/30 Project planning report due
    Smoothing n-grams
  • Add-one or add-λ smoothing
  • Cross-validation
  • Smoothing with backoff
  • Assignment 1 due
    Log-linear models and neural language models
  • Log-linear models
  • Neural network basics (recap)
  • Feedforward neural language models
  • Deep learning workshop
  • PyTorch Part (2)
  • Neural language models: J&M 7
  • OpenAI blog post on GPT-2 (with paper)
2/6 Assignment 2 release
    Assignment 1 answer key release
    RNN language models
  • Recurrent neural networks (RNNs)
  • Long short-term memory networks (LSTMs)
  • Transformers
  • Long short-term memory networks (LSTMs)
  • The transformer model
  • Review session (Language Models)
  • Project Milestone 2 Discussion
  • Transformer paper; BERT paper
2/13 Midterm exam
    (12:00-1:50pm in class)

    Return assignment 1 grades
    Pre-Trained Large Language Models
  • ELMo
  • BERT
  • GPT-(2,3)
  • Intro to Hugging Face
2/20 Assignment 2 due
    No lecture (Presidents' Day)
    Project midterm report due
    Syntax
  • Part-of-speech tagging
  • NP Chunking
  • Shallow Parsing
  • Return midterm exam grades
  • Midterm Solutions Discussion
  • Project Milestone 3 Discussion
  • John Lafferty's paper on CRFs
2/27 Assignment 3 release
    Sequence tagging models
  • POS-tagging leftovers
  • Hidden Markov Models (HMMs)
  • The Viterbi Algorithm
  • Sequence tagging models (cont.)
  • The Viterbi Algorithm leftovers
  • Maximum Entropy Markov Models (MEMMs)
  • Review session (Syntax + Seq Tagging)
  • The Viterbi Algorithm: J&M 8
  • Hidden Markov Models: J&M Appendix A
3/6 Named Entity Recognition
  • MEMM leftovers
  • Intro to NER
  • Nested NER
  • Probabilistic parsing
  • What is parsing?
  • Why is it useful?
  • Brute-force algorithm
  • CKY algorithm
  • PCFG parsing
  • NLP Application Case Study
  • Attributes: J&M 12
  • Parsing: J&M 13
3/13 Assignment 3 due
    Dependency Parser
  • Dependency grammar
  • Dependency trees
  • Dependency Parser (Cont.)
  • Shift-reduce parser
  • Final exam recitation
  • CCG: Steedman & Baldridge; more
  • TAG/TSG: Van Noord, Guo, Zhang 1/2/3
  • Prob. parsing: J&M 14