Dynamic Programming for Linear Time Incremental Parsing


Dynamic Programming for Linear Time Incremental Parsing

Liang Huang, Information Sciences Institute, University of Southern California
Kenji Sagae, Institute for Creative Technologies, University of Southern California

ACL 2010, Uppsala, Sweden, July 2010 (slightly expanded)

Ambiguities in Parsing

"I feed cats nearby in the garden ..." Let's focus on dependency structures for simplicity. The attachments of "nearby" and "in" are ambiguous, and the number of ambiguities explodes exponentially with sentence length, so we must design an efficient (polynomial-time) search algorithm, typically using dynamic programming (DP), e.g. CKY.

But full DP is too slow...

"I feed cats nearby in the garden ..." Full DP (like CKY) is too slow (cubic-time), while human parsing is fast and incremental (linear-time). How about incremental parsing, then? Yes, but only with greedy search, so accuracy suffers: it explores a tiny fraction of the possible trees (even with beam search). Can we combine the merits of both approaches: a fast, incremental parser with dynamic programming, exploring exponentially many trees in linear time?

Linear-Time Incremental DP

Incremental parsing (e.g. shift-reduce) is fast (linear-time) but uses greedy search (Nivre 04; Collins/Roark 04; ...). Full DP (e.g. CKY) offers principled search (Eisner 96; Collins 99; ...) but is slow (cubic-time). This work: fast shift-reduce parsing with dynamic programming.

Preview of the Results

A very fast linear-time dynamic programming parser with the best reported dependency accuracy on PTB/CTB; it explores exponentially many trees (and outputs a forest). [Plots: parsing time (secs) vs. sentence length for Charniak, Berkeley, MST, and this work; number of trees explored vs. sentence length, where DP is exponential and non-DP beam search is constant.]

Outline

Motivation; Incremental (Shift-Reduce) Parsing; Dynamic Programming for Incremental Parsing; Experiments.

Shift-Reduce Parsing

"I feed cats nearby in the garden ."

step  action    stack                 queue
0     -         (empty)               I feed cats ...
1     shift     I                     feed cats nearby ...
2     shift     I feed                cats nearby in ...
3     l-reduce  feed (I <- feed)      cats nearby in ...
4     shift     feed cats             nearby in the ...
5a    r-reduce  feed (feed -> cats)   nearby in the ...
5b    shift     feed cats nearby      in the garden ...

Steps 5a and 5b compete from the same configuration: a shift-reduce conflict.
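
To make the transition system concrete, here is a minimal runnable sketch of the three actions in Python (the helper names are ours, not from the paper; dependencies are recorded as (head, dependent) arcs):

```python
# Minimal sketch of the arc-standard shift-reduce actions (illustrative).

def shift(stack, queue):
    """Move the front word of the queue onto the stack."""
    stack.append(queue.pop(0))

def l_reduce(stack, arcs):
    """Make the second-from-top word a left dependent of the top word."""
    top = stack.pop()
    left = stack.pop()
    arcs.append((top, left))      # (head, dependent)
    stack.append(top)

def r_reduce(stack, arcs):
    """Make the top word a right dependent of the second-from-top word."""
    right = stack.pop()
    arcs.append((stack[-1], right))

# Replaying steps 1-4 of the trace above:
stack, queue, arcs = [], "I feed cats nearby in the garden .".split(), []
shift(stack, queue)               # stack: [I]
shift(stack, queue)               # stack: [I, feed]
l_reduce(stack, arcs)             # stack: [feed], arcs: [(feed, I)]
shift(stack, queue)               # stack: [feed, cats] -- the conflict point
```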

Choosing Parser Actions

Stack: ... feed, cats (with "nearby" attached); queue: in the garden ... Positions: ... s2 s1 s0 | q0 q1 ... Features: (s0.w, s0.rc, q0, ...) = (cats, nearby, in, ...). Each action is scored using features f and weights w. The features are drawn from a local window; this abstraction (or signature) of a state is what inspires DP! The weights are trained by a structured perceptron (Collins 02).
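
A minimal sketch of this kind of feature-based linear scoring, with a deliberately tiny state and feature set (the State layout and templates are ours, for illustration only; the real parser uses many more templates):

```python
from collections import namedtuple

# Hypothetical minimal state: a stack of (word, rightmost_child) pairs
# and a queue of remaining words.
State = namedtuple("State", "stack queue")

def features(state):
    s0_w, s0_rc = state.stack[-1] if state.stack else ("<none>", "<none>")
    q0 = state.queue[0] if state.queue else "<none>"
    return [("s0.w", s0_w), ("s0.rc", s0_rc), ("q0.w", q0)]

def score(state, action, weights):
    # Linear model: w . f(state, action)
    return sum(weights.get((feat, action), 0.0) for feat in features(state))

# Example matching the slide: s0 = cats (right child: nearby), q0 = in
state = State(stack=[("feed", "<none>"), ("cats", "nearby")],
              queue=["in", "the", "garden"])
print(score(state, "shift", {(("q0.w", "in"), "shift"): 1.0}))  # 1.0
```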

Greedy Search

Each state yields three new states (shift, l-reduce, r-reduce), so the search space is exponential. Greedy search always picks the single best next state.

Beam Search

Each state yields three new states (shift, l-reduce, r-reduce), so the search space is exponential. Beam search always keeps the top-b states.
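
A minimal sketch of beam-search decoding over parser states, assuming a successors(state) helper that enumerates legal (action, next_state) pairs plus the score function sketched above; greedy search is simply the special case b = 1:

```python
# Sketch of beam-search decoding (greedy search is b = 1). `successors`
# and `score` are assumed helpers, not the paper's actual code.

def beam_search(init_state, n_steps, b, weights):
    beam = [(0.0, init_state)]
    for _ in range(n_steps):
        candidates = [
            (total + score(state, action, weights), next_state)
            for total, state in beam
            for action, next_state in successors(state)
        ]
        # Keep only the top-b states; everything else is pruned forever.
        beam = sorted(candidates, key=lambda c: -c[0])[:b]
    return max(beam, key=lambda c: c[0])
```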

Dynamic Programming

Each state yields three new states (shift, l-reduce, r-reduce). The key idea of DP is to share common subproblems: merging equivalent states yields polynomial space, via a graph-structured stack (Tomita, 1988). Each DP state corresponds to exponentially many non-DP states. [Plot: number of trees explored vs. sentence length: DP is exponential, while non-DP beam search is constant.]
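
A sketch of what the merging might look like inside one step of the beam-search loop above (the signature contents are illustrative; the real parser's signature is its full feature window):

```python
# Sketch of DP state merging: candidates that share a feature signature
# are packed into one item, keeping only the best-scoring copy (with
# backpointers, this packing is what yields the output forest).

def signature(state):
    # Must include everything the features can see: equal signatures
    # guarantee equal costs for all future actions.
    s0 = state.stack[-1] if state.stack else None
    return (s0, tuple(state.queue[:2]))

def merge(candidates):
    merged = {}                            # signature -> (score, state)
    for total, state in candidates:
        sig = signature(state)
        if sig not in merged or total > merged[sig][0]:
            merged[sig] = (total, state)   # keep the best-scoring copy
    return list(merged.values())
```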

Merging Equivalent States

Two states are equivalent if they agree on the features, because the same features guarantee the same cost. Consider the shift-reduce conflict after "I feed cats": one branch shifts "nearby", the other reduces "cats" into "feed". Assume the features only look at the root of s0; then two states are equivalent if they agree on the root of s0. As parsing continues, the branches reconverge: once the shift branch reduces "nearby" into "cats" and then "cats" into "feed", both branches have "feed" as the root of s0 and are merged before shifting "in". This is (local) ambiguity-packing, and the merged states form a graph-structured stack.

Theory: Polynomial-Time DP

This DP is exact and polynomial-time if the features are: (a) bounded, for polynomial time: features may only look at a local window; and (b) monotonic, for correctness (optimal substructure): features should draw no more information from trees farther from the stack top than from trees closer to the top. Both conditions are intuitive: (a) is always true in practice; (b) is almost always true.

Theory: Monotonic History

Related: grammar refinement by annotation (Johnson, 1998), which annotates vertical context history (e.g., the parent). Monotonicity there means you cannot annotate the grandparent without annotating the parent (otherwise DP would fail). Our features use left-context history instead of vertical context: similarly, we cannot annotate s2 without annotating s1. But we can always design a minimal monotonic superset of any feature set.

Related Work

Graph-Structured Stack (Tomita 88): Generalized LR. A GSS is just a chart viewed from left to right (cf. Earley 70). This line of work started with Lang (1974) but has been stuck since 1990, because an explicit LR table is impossible with modern grammars. The general idea is to compile the CFG parse chart into FSAs (e.g., our beam). We revive and advance this line of work in two respects. Theoretical: an implicit LR table based on features, with merging and splitting on the fly (no pre-compilation needed) and monotonic feature functions to guarantee correctness (new). Practical: linear-time performance with pruning.

Experiments

Speed Comparison

DP is 5 times faster than non-DP at the same parsing accuracy. [Chart: time (hours), DP vs. non-DP.]

Correlation of Search and Parsing

Better search quality correlates with better parsing accuracy. [Plot: dependency accuracy (92.2 to 93.1) vs. average model score (2365 to 2395), DP vs. non-DP.]

Search Space: Exponential

[Plot: number of trees explored (log scale, 10^0 to 10^10) vs. sentence length: DP is exponential; non-DP is fixed at the beam width.]

N-Best / Forest Oracles

[Plot: oracle accuracies: DP forest oracle (98.15) > DP k-best in forest > non-DP k-best in beam.]

Better Search => Better Learning

DP leads to faster and better learning with the perceptron.

Learning Details: Early Updates

Greedy search: update at the first error (Collins/Roark 04). Beam search: update when the gold derivation is pruned off the beam (Zhang/Clark 08). DP search: also update when the gold derivation is merged into a higher-scoring state (new!), because we then know gold cannot make it to the top again.
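
A runnable toy sketch of the early-update idea, using action prefixes as stand-ins for full parser states (the feature template, ACTIONS set, and all names here are ours, purely for illustration; they are not the paper's code):

```python
# Toy sketch of early-update perceptron training with beam search.
# States are just action prefixes scored by (previous action, action)
# features -- a stand-in for real parser states.

ACTIONS = ["shift", "l-reduce", "r-reduce"]

def feats(prefix):
    prev = ["<s>"] + prefix[:-1]
    return [(p, a) for p, a in zip(prev, prefix)]

def total_score(prefix, weights):
    return sum(weights.get(f, 0.0) for f in feats(prefix))

def train_step(gold, weights, b=2):
    beam = [[]]
    for t in range(len(gold)):
        candidates = [p + [a] for p in beam for a in ACTIONS]
        beam = sorted(candidates,
                      key=lambda p: -total_score(p, weights))[:b]
        if gold[:t + 1] not in beam:
            # Early update: gold fell off the beam and can never win,
            # so update on the prefix and stop (w += f(gold) - f(pred)).
            for f in feats(gold[:t + 1]):
                weights[f] = weights.get(f, 0.0) + 1.0
            for f in feats(beam[0]):
                weights[f] = weights.get(f, 0.0) - 1.0
            return

gold = ["shift", "shift", "l-reduce", "shift"]
w = {}
for _ in range(10):
    train_step(gold, w)
```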

Parsing Time vs. Sentence Length

Parsing speed (scatter plot) compared to other parsers. [Plot: parsing time (secs) vs. sentence length, with empirical complexities: Charniak O(n^2.5), Berkeley O(n^2.4), MST O(n^2), this work O(n).]

Final Results

Much faster than the major parsers (even though ours is implemented in Python!); the first linear-time incremental dynamic-programming parser; best reported dependency accuracy on the Penn Treebank.

parser                      accuracy  time (secs)  complexity  trees searched
McDonald et al 05 (MST)     90.2      0.12         O(n^2)      exponential
Koo et al 08 (baseline)*    92.0      -            O(n^4)      exponential
Zhang & Clark 08 (single)   91.4      0.11         O(n)        constant
this work                   92.1      0.04         O(n)        exponential
Charniak 00                 92.5      0.49         O(n^2.5)    exponential
Petrov & Klein 07           92.4      0.21         O(n^2.4)    exponential

*At this ACL: Koo & Collins 10 report 93.0 with O(n^4).

Final Results on Chinese

Also the best parsing accuracy on Chinese: Penn Chinese Treebank (CTB 5); all numbers below use gold-standard POS tags.

parser                      word  non-root  root
Duan et al. 2007            83.9  84.4      73.7
Zhang & Clark 08 (single)   84.3  84.7      76.7
this work                   85.2  85.5      78.3

Conclusion

Incremental parsing (e.g. shift-reduce) with greedy search is fast (linear-time); full dynamic programming (e.g. CKY) offers principled search but is slow (cubic-time). This work: linear-time shift-reduce parsing with dynamic programming, combining principled search with incremental speed.

Thank You

A general theory of DP for shift-reduce parsing, as long as the features are bounded and monotonic. A fast, accurate DP parser, with a release coming soon: http://www.isi.edu/~lhuang and http://www.ict.usc.edu/~sagae. Future work: adapt to constituency parsing (straightforward); other grammar formalisms like CCG and TAG; integrate POS tagging into the parser.