Most web data today consists of unstructured text. This data is of little use unless it is
made available in a way that lets users quickly find the information relevant to their needs.
This course covers the fundamentals needed to build systems that make this possible,
including web crawling; index construction and compression; Boolean, vector-based, and
probabilistic retrieval models; text classification and clustering; link analysis algorithms
such as PageRank; and computational advertising. Students will also complete one programming
project, in which they will build a complex application that combines multiple algorithms
into a system that solves real-world problems.
Time and Place
Monday/Wednesday 5:00pm - 6:15pm in Gould-Simpson, Room 906
Instructor Information
Instructor: Mihai Surdeanu
msurdeanu AT email DOT arizona DOT edu
Office: Gould-Simpson 746
Office Hours: by request
TA: Enrique Noriega
enoriega AT email DOT arizona DOT edu
Office: Gould-Simpson 934
Office Hours: Wednesday, 2 - 3pm