Mihai Surdeanu

mihai AT surdeanu DOT info / msurdeanu AT arizona DOT edu

With clulab.org

Please visit the software page at clulab.org.

Before clulab.org

Multi-instance Multi-label Relation Extraction

This is the code and data from our EMNLP 2012 paper. Additionally, this package includes all the source code from our KBP 2011 paper on slot filling.

Stanford CoreNLP

I contribute to Stanford's CoreNLP suite of natural language analysis tools. The suite covers tokenization, morphological analysis, POS tagging, named and numeric entity recognition, syntactic parsing (both constituents and dependencies), and coreference resolution. The software lets you generate all these annotations with just two lines of code: first create a StanfordCoreNLP object, then call its annotate(new Annotation(String yourText)) method.

Stanford Biomedical Event Parser

This software is the event parser component from the Stanford and FAUST submissions to the BioNLP shared task.

Ensemble: Linearly-interpolated Dependency Parsers

This code implements a linear interpolation of several linear-time parsing models (all based on MaltParser). Each individual parser runs in its own thread, so when enough cores are available the overall runtime is close to that of a single MaltParser model. The resulting parser has state-of-the-art performance yet remains very fast.
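
The idea of running several parsers in parallel and interpolating their outputs can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the package: the three parser functions are hypothetical stand-ins that return a fixed head index per token, the weights are made up, and the combination step is a simple per-token weighted vote that does not check whether the result is a well-formed tree.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def weighted_vote(predictions, weights):
    """For each token, pick the head with the largest total model weight."""
    n_tokens = len(predictions[0])
    combined = []
    for i in range(n_tokens):
        scores = defaultdict(float)
        for heads, weight in zip(predictions, weights):
            scores[heads[i]] += weight
        # Ties go to the smallest head index (an arbitrary choice for this sketch).
        combined.append(max(sorted(scores), key=lambda h: scores[h]))
    return combined


def ensemble_parse(sentence, parsers, weights):
    """Run every parser in its own thread, then interpolate their outputs."""
    with ThreadPoolExecutor(max_workers=len(parsers)) as pool:
        futures = [pool.submit(parser, sentence) for parser in parsers]
        predictions = [future.result() for future in futures]
    return weighted_vote(predictions, weights)


# Hypothetical stand-ins for trained parsers. Each returns, for every token,
# the 1-based index of its head (0 means the artificial root).
def parser_a(sentence):
    return [2, 0, 2]


def parser_b(sentence):
    return [2, 0, 1]


def parser_c(sentence):
    return [3, 0, 2]


sentence = ["dogs", "bark", "loudly"]
heads = ensemble_parse(sentence, [parser_a, parser_b, parser_c], [0.5, 0.3, 0.2])
print(heads)
```

Because every model works on the same sentence independently, the wall-clock time of the parallel step is bounded by the slowest model rather than by the sum of all models, which is the property the description above relies on.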

SwiRL: The Semantic Role Labeler

SwiRL is a Semantic Role Labeling (SRL) system for English constructed on top of the full syntactic analysis of text. It achieved state-of-the-art performance in the CoNLL 2005 SRL evaluation.

Bios: Suite of Syntactico-Semantic Analyzers

This suite includes a named-entity recognizer, a syntactic chunker, a POS tagger, and a "smart" tokenizer. All processors are trained using the MiLL machine learning library (see below).

MiLL: Machine Learning Library

MiLL includes SVM, Maximum Entropy, and Perceptron classifiers under a single, simple interface. All algorithms support multi-class problems. MiLL includes the novel Perceptron algorithm with dynamic uneven margins that I designed for my ACE Information Extraction system (see the publication page). MiLL is distributed together with BIOS, but it can be used independently of BIOS for any ML task.
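
To give a feel for what "uneven margins" means, here is a minimal sketch of a binary perceptron that requires a different margin for positive and negative examples. It is not the MiLL algorithm: the margins here are fixed constants, and the dynamic adjustment mentioned above is not reproduced because this page does not describe it. The function names, the margin values, and the toy data are all assumptions made for illustration.

```python
def train_uneven_margin_perceptron(data, tau_pos=1.0, tau_neg=0.1, epochs=10, lr=1.0):
    """Train a binary perceptron with class-specific (fixed) margins.

    data is a list of (features, label) pairs with label in {+1, -1}.
    An example triggers an update whenever its signed score does not
    exceed the margin required for its class.
    """
    dim = len(data[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = tau_pos if y > 0 else tau_neg
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= margin:
                # Standard perceptron update, applied on margin violations.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Toy, linearly separable data (made up for this sketch).
data = [
    ([2.0, 1.0], 1),
    ([1.5, 2.0], 1),
    ([-1.0, -1.5], -1),
    ([-2.0, -0.5], -1),
    ([-1.0, -2.0], -1),
]
w, b = train_uneven_margin_perceptron(data)
print(w, b, [predict(w, b, x) for x, _ in data])
```

A larger margin for the positive class pushes the decision boundary away from positive examples, which is useful when positives are rare, as is typical in information extraction.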

Spear: Syntactic Parser

Spear is a syntactic parser heavily based on Michael Collins' Model 1 parser. The Spear package also includes a corpus of parsed questions that I created from the TREC 8-12 evaluations. This corpus was crucial in improving the parser's performance on questions.