Courses of Study 2023-2024 [ARCHIVED CATALOG]
Nov 25, 2024


CS 5740 - Natural Language Processing


     


Fall, Spring. Ithaca: 4 credits; Cornell Tech: 3 credits. Fall: student option grading (no audit); Spring: letter grades only (no audit).

Fall (Ithaca):

  • Strong programming skills are important. Three semesters of programming classes are strongly recommended (e.g., completion of CS 3110). CS 2110 may suffice if you could have completed its assignments successfully and easily on your own.
  • Python experience.
  • PyTorch experience (e.g., through CS 4780) is not required, but some students report that it is very helpful.
  • Comfort with elementary probability.
  • Clear understanding of matrix and vector operations.
  • Familiarity with differentiation.

Spring (New York City):

  • CS 4780/5780, CS 5785, CS 5781, or equivalent machine learning course experience.
  • Strong experience with Python.
  • Familiarity with a numerical library (e.g., NumPy).
  • Experience with a neural network framework (e.g., PyTorch, TensorFlow).
  • Strong understanding of foundational CS concepts such as memory requirements and computational complexity.  
  • Students need to be comfortable with calculus and probability, primarily differentiation and basic discrete distributions.

Fall: Ithaca; Spring: New York City. Co-meets with COGST 4740/CS 4740/LING 4474 (Fall only).

Fall: L Lee; Spring: Staff.

This course constitutes an introduction to natural language processing (NLP), the goal of which is to enable computers to use human languages as input, output, or both. NLP is at the heart of many of today’s most exciting technological achievements, including machine translation, automatic conversational assistants and Internet search. The course will introduce core problems and methodologies in NLP, including machine learning, problem design, and evaluation methods. 

Ithaca only: Expect each of the roughly four connected programming assignments to take tens of hours, although this time is distributed over multiple weeks. The assignments require writing code to massage raw-ish data into different formats and to provide other accessory functions, as well as implementing core algorithms, and they necessitate much independent examination of documentation.


