ORIE 5741

Course information provided by the Courses of Study 2023-2024.

Modern data sets, whether collected by scientists, engineers, medical researchers, government, financial firms, social networks, or software companies, are often big, messy, and extremely useful. This course addresses scalable, robust methods for learning from big messy data. We'll cover techniques for learning with data that is messy (consisting of real numbers, integers, booleans, categoricals, ordinals, graphs, text, sets, and more, with missing entries and with outliers) and that is big (which means we can only use algorithms whose complexity scales linearly in the size of the data). Topics include cleaning data, supervised and unsupervised learning, finding similar items, model validation, and feature engineering.

When Offered: Fall, Spring.

Prerequisites/Corequisites: Prerequisite: MATH 2940, ENGRD 2700, ENGRD 2110/CS 2110, and CS 2800, or equivalents.

Syllabi: none
  • Regular Academic Session. Choose one lecture and one discussion. Combined with: ORIE 4741

  • 4 Credits, Graded (no audit)

  •  9800 ORIE 5741   LEC 001

  • Pre-enrollment restricted to ORIE MEng students. Other graduate students, Early Admit MEng students, and OR&E Honors Program students may enroll during Add/Drop.

  •  9801 ORIE 5741   DIS 201

  •  9802 ORIE 5741   DIS 202

  •  9803 ORIE 5741   DIS 203

  •  9804 ORIE 5741   DIS 204