Entailment above the word level in distributional semantics


Marco Baroni (University of Trento)
Raffaella Bernardi (University of Trento)
Ngoc-Quynh Do (EM LCT, Free University of Bozen-Bolzano)
Chung-chieh Shan (Cornell University, University of Tsukuba)
EACL, 25 April 2012

Summary

- Entailment among composite phrases rather than nouns. (Cheap training data!)
- Entailment among logical words rather than content words. (Part of Recognizing Textual Entailment?)
- Different entailment relations at different semantic types. (Prediction from formal semantics.)

Noun type: train on AN ⊨ N (big cat ⊨ cat), test on N ⊨ N (dog ⊨ animal).
Quantifier type: train on QN ⊨ QN (many dogs ⊨ some dogs), test on QN ⊨ QN (all cats ⊨ several cats).

Approaches to semantics

"In order to say what a meaning is, we may first ask what a meaning does, and then find something that does that." (David Lewis)

Formal semantics: truth, entailment
  Every person cried. ⊨ Every professor cried.
  A professor cried. ⊨ A person cried.

  Every person cried:  ∀x. Px → Cx
    every person:      λg. ∀x. Px → gx        cried:  C
      every:           λf. λg. ∀x. fx → gx    person: P

Distributional semantics: concepts, similarity
  ambulance and battleship, compared with ambulance and bookstore

              abandon  abdominal  ability  academic  accept  ...
  ambulance        27         10       50        17     130  ...
  battleship       35          0       32         1      25  ...
  bookstore         5          0        6        33      13  ...
  ...
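The similarity comparison on this slide can be made concrete with the five co-occurrence counts shown for each word. The sketch below uses cosine as the similarity measure, which is an assumption for illustration (the slide does not name a measure at this point); the function name `cosine` is ours.

```python
import math

# Co-occurrence counts from the slide; columns are the contexts
# abandon, abdominal, ability, academic, accept.
counts = {
    "ambulance":  [27, 10, 50, 17, 130],
    "battleship": [35, 0, 32, 1, 25],
    "bookstore":  [5, 0, 6, 33, 13],
}

def cosine(u, v):
    """Cosine of the angle between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

sim_battleship = cosine(counts["ambulance"], counts["battleship"])
sim_bookstore = cosine(counts["ambulance"], counts["bookstore"])
print(round(sim_battleship, 2), round(sim_bookstore, 2))
```

On these five columns alone, ambulance comes out closer to battleship than to bookstore; with the full context vocabulary the numbers would of course differ.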


Distributional semantics for entailment among words

For each word w, rank contexts c by descending Pr(c | w) / Pr(c) > 1 (pointwise mutual information).

Top-ranked contexts:
  parent: argcount n, arglist n, arglist j, phane n, specity n, qdisc n, carthy n, parents-to-be n, non-resident j, step-parent n, tc n, ballons n, eliza n, symptons n, adoptive j, stepparent n, nonresident j, home-school n, scabrid n, petiolule n, ...
  person: anglia n, first-mentioned j, unascertained j, enure v, deposit-taking j, bonis n, iconclass j, cotswolds n, aforesaid n, haver v, foresaid j, gha n, sub-paragraphs n, enacted j, geest j, non-medicinal j, sub-paragraph n, intimation n, arrestment n, incumbrance n, ...
  professor: william n, extraordinarius n, ordinarius n, francis n, reid n, emeritus n, emeritus j, derwent n, regius n, laurence n, edward n, carisoprodol n, adjunct j, winston n, privatdozent j, edward j, xanax n, tenure v, cialis n, florence n, ...
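A minimal sketch of the ranking step, assuming co-occurrence counts are stored as a mapping from (word, context) pairs to counts. The function name `rank_contexts` and the toy counts are hypothetical and are not taken from the corpus behind the slides.

```python
from collections import Counter

def rank_contexts(pair_counts, word):
    """Return contexts of `word` with Pr(c | w) / Pr(c) > 1, best first."""
    total = sum(pair_counts.values())
    word_total = sum(n for (w, _), n in pair_counts.items() if w == word)
    context_totals = Counter()
    for (_, c), n in pair_counts.items():
        context_totals[c] += n
    ratios = {}
    for (w, c), n in pair_counts.items():
        if w != word:
            continue
        # Pr(c | w) / Pr(c), estimated from relative frequencies.
        ratio = (n / word_total) / (context_totals[c] / total)
        if ratio > 1:
            ratios[c] = ratio
    # Descending ratio; ties broken alphabetically for determinism.
    return sorted(ratios, key=lambda c: (-ratios[c], c))

# Hypothetical toy counts #(w, c), for illustration only.
toy = Counter({
    ("professor", "emeritus"): 8,
    ("professor", "tenure"): 4,
    ("professor", "be"): 20,
    ("person", "aforesaid"): 6,
    ("person", "be"): 60,
    ("person", "tenure"): 1,
})
ranked = rank_contexts(toy, "professor")
print(ranked)
```

A frequent but uninformative context such as "be" falls below the ratio of 1 and drops out, which is why the lists on the slide are dominated by rare, word-specific contexts.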

Distributional semantics for entailment among words

[Figure: context overlap with word 2 (y-axis, 0 to 3000) plotted against context rank of word 1 (x-axis, 0 to 5000), with one curve per ordered pair (parent-person, professor-person, person-parent, professor-parent, person-professor, parent-professor) and a "perfect" reference line.]

Better: skew divergence (Lee), balAPinc (Kotlerman et al.), ...
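One reading of the figure's axes, sketched under the assumption that the y-value at rank k is the number of word 1's top-k contexts that also occur among word 2's contexts. The function name `overlap_curve` and the toy context lists are hypothetical.

```python
def overlap_curve(ranked_1, ranked_2):
    """For each rank k, count how many of word 1's top-k contexts
    also appear among word 2's contexts."""
    contexts_2 = set(ranked_2)
    curve, shared = [], 0
    for context in ranked_1:
        if context in contexts_2:
            shared += 1
        curve.append(shared)
    return curve

# Hypothetical ranked context lists, best context first.
professor = ["emeritus", "tenure", "teach", "employ"]
person = ["employ", "teach", "aforesaid", "born", "tenure"]

professor_person = overlap_curve(professor, person)
person_professor = overlap_curve(person, professor)
print(professor_person, person_professor)
```

Under this reading, the "perfect" line corresponds to a curve whose value at rank k is exactly k, meaning every context of word 1 is shared by word 2.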

Above the word level

Phrases have corpus distributions too! But compare N, AN, and QN:

  Phrase  Example           Syntactic category  Semantic type
  N       cat               N                   e → t
  AN      white cat         N                   e → t
  AAN     big white cat     N                   e → t
  QN      every cat         QP                  (e → t) → t
  QAN     every big cat     QP                  (e → t) → t
  AQN     *big every cat
  QQN     *some every cat

Our questions

- Entailment among composite phrases rather than nouns?
- Entailment among logical words rather than content words?
- Different entailment relations at different semantic types?

Noun type: train on AN ⊨ N (big cat ⊨ cat), test on N ⊨ N (dog ⊨ animal).
Quantifier type: train on QN ⊨ QN (many dogs ⊨ some dogs), test on QN ⊨ QN (all cats ⊨ several cats).

Our semantic space

Corpora: BNC, WackyPedia, ukWaC; 2.8G lemmatized, POS-tagged tokens (TreeTagger, Schmid).
Co-occurrence counts #(c, w): words and phrases in the same sentence.
  Rows w: AN, QN, A, Q, N (48K).
  Columns c: most frequent A, N, V (27K).
PMI: log Pr(c | w) / Pr(c).
SVD (300 dimensions): UΣ.
Classifiers: frequency baseline, cosine baseline, balAPinc, SVM.
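A minimal sketch of the counting step, assuming each sentence is a list of lemmas, that phrase targets (AN, QN) are adjacent two-lemma sequences, and that a target does not count its own lemmas as contexts. These choices, and the function name `cooccurrence_counts`, are illustrative assumptions; the slides state only that words and phrases are counted with contexts in the same sentence.

```python
from collections import Counter

def cooccurrence_counts(sentences, targets, contexts):
    """Count #(w, c) for target words/phrases w and context lemmas c
    occurring in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        n = len(sentence)
        # Candidate units: single lemmas and adjacent two-lemma phrases.
        spans = [(i, i + 1) for i in range(n)]
        spans += [(i, i + 2) for i in range(n - 1)]
        for start, end in spans:
            unit = " ".join(sentence[start:end])
            if unit not in targets:
                continue
            for j, lemma in enumerate(sentence):
                # Skip positions that belong to the unit itself.
                if lemma in contexts and not (start <= j < end):
                    counts[(unit, lemma)] += 1
    return counts

# Hypothetical lemmatized sentences.
sentences = [["every", "cat", "chase", "mouse"], ["big", "cat", "sleep"]]
targets = {"cat", "every cat", "big cat"}
contexts = {"chase", "mouse", "sleep", "cat"}
counts = cooccurrence_counts(sentences, targets, contexts)
print(sorted(counts.items()))
```

The resulting rows for "cat", "big cat", and "every cat" live in the same context space, which is what lets phrase vectors and word vectors be compared directly.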

Our entailment classifiers

balAPinc (Kotlerman et al.): computed on the PMI vectors, log Pr(c | w) / Pr(c). It yields a score between 0 and 1; predict entailment when the score is > threshold.
SVM: computed on the SVD-reduced vectors UΣ. SVM (cubic) outperformed naïve Bayes and k-NN.

Train and test settings:
  Train on AN ⊨ N, test on N ⊨ N.
  Train on QN ⊨ QN, test on QN ⊨ QN.
  Train on AN ⊨ N, test on QN ⊨ QN.
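A sketch of the balAPinc score followed by the threshold test. It follows one common statement of Kotlerman et al.'s measure (the geometric mean of APinc and Lin similarity), written from memory, so the exact definitions should be checked against their paper before relying on it. Vectors are assumed to be dicts from context to positive weight; all function names and the toy vectors are hypothetical.

```python
import math

def ranked_features(vector):
    """Contexts with positive weight, highest weight first (ties by name)."""
    positive = {f: w for f, w in vector.items() if w > 0}
    return sorted(positive, key=lambda f: (-positive[f], f))

def apinc(u, v):
    """Average precision of u's ranked contexts being included in v's."""
    feats_u, feats_v = ranked_features(u), ranked_features(v)
    if not feats_u or not feats_v:
        return 0.0
    rank_in_v = {f: r for r, f in enumerate(feats_v, start=1)}
    included, total = 0, 0.0
    for r, f in enumerate(feats_u, start=1):
        if f in rank_in_v:
            included += 1
            precision = included / r
            relevance = 1 - rank_in_v[f] / (len(feats_v) + 1)
            total += precision * relevance
    return total / len(feats_u)

def lin(u, v):
    """Lin similarity: weight mass on shared contexts over total mass."""
    fu = {f: w for f, w in u.items() if w > 0}
    fv = {f: w for f, w in v.items() if w > 0}
    denom = sum(fu.values()) + sum(fv.values())
    if denom == 0:
        return 0.0
    return sum(fu[f] + fv[f] for f in fu.keys() & fv.keys()) / denom

def balapinc(u, v):
    """Score in [0, 1] for 'u entails v'."""
    return math.sqrt(lin(u, v) * apinc(u, v))

def entails(u, v, threshold):
    return balapinc(u, v) > threshold

# Hypothetical PMI-weighted vectors: the narrower term's contexts
# are a subset of the broader term's contexts.
narrow = {"a": 3, "b": 2}
broad = {"a": 3, "b": 2, "c": 1}
forward = balapinc(narrow, broad)
backward = balapinc(broad, narrow)
print(round(forward, 3), round(backward, 3))
```

The score is directional: the narrower term scores higher as the entailing side than the broader term does, and a threshold tuned on training pairs turns the score into a yes/no decision.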

Our data sets

N ⊨ N (type e → t), from WordNet:
  Hypernym chains: pope ⊨ spiritual_leader ⊨ leader; cat ⊨ feline ⊨ carnivore; ...
  Positive: pope ⊨ leader, cat ⊨ carnivore, ... (1385)
  Negative, by inverting and resampling: leader ⊭ pope, cat ⊭ leader, ... (1385)

AN ⊨ N (type e → t):
  Most frequent adjectives: big, former, ... (300, reduced to 256)
  BLESS nouns: apple, shirt, ... (200)
  Positive: big apple ⊨ apple, big shirt ⊨ shirt, ... (1246)
  Negative, by resampling: big apple ⊭ shirt, big shirt ⊭ apple, ... (1244)

QN ⊨ QN (type (e → t) → t):
  Most frequent quantifiers: all, both, each, either, every, few, many, most, much, no, several, some
  Positive quantifier pairs: all ⊨ some, many ⊨ several, ... (13)
  Negative quantifier pairs: some ⊭ every, both ⊭ many, ... (17)
  WordNet nouns: pope, leader, cat, carnivore, ... (6402)
  Positive: all cat ⊨ some cat, many cat ⊨ several cat, ... (7537)
  Negative: some cat ⊭ every cat, both cat ⊭ many cat, ... (8455)

Train and test:
  Train on AN ⊨ N, test on N ⊨ N (both at type e → t).
  Train and test on QN ⊨ QN (type (e → t) → t).
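A sketch of how negative pairs could be derived from positive ones by inverting and resampling. It assumes the positive set is complete, so that any resampled pair outside it is non-entailing; how the authors balanced the two sources is not stated on the slide. The function name `make_negatives` and the toy pairs are hypothetical.

```python
import random

def make_negatives(positives, seed=0):
    """Build negatives by inverting positive pairs and by re-pairing
    left and right members at random."""
    rng = random.Random(seed)
    positive_set = set(positives)
    # Invert: if a entails b, then b is assumed not to entail a.
    inverted = [(b, a) for a, b in positives if (b, a) not in positive_set]
    lefts = [a for a, _ in positives]
    rights = [b for _, b in positives]
    resampled = []
    attempts = 0
    # Resample: pair a random left member with a random right member.
    while len(resampled) < len(positives) and attempts < 100 * len(positives):
        attempts += 1
        pair = (rng.choice(lefts), rng.choice(rights))
        if pair[0] != pair[1] and pair not in positive_set and pair not in resampled:
            resampled.append(pair)
    return inverted, resampled

# Hypothetical positive pairs in the style of the slide.
positives = [("pope", "leader"), ("cat", "feline"), ("apple", "fruit")]
inverted, resampled = make_negatives(positives)
print(inverted)
print(resampled)
```

Inverted pairs keep the two words related while flipping the direction, so a classifier cannot succeed on them by similarity alone; resampled pairs test unrelated words.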

Results at noun type

                     P     R     F     Accuracy (95% C.I.)
  SVM upper          88.6  88.6  88.5  88.6 (87.3-89.7)
  balAPinc AN ⊨ N    65.2  87.5  74.7  70.4 (68.7-72.1)
  balAPinc upper     64.4  90.0  75.1  70.1 (68.4-71.8)
  SVM AN ⊨ N         69.3  69.3  69.3  69.3 (67.6-71.0)
  cos(N1, N2)        57.7  57.6  57.5  57.6 (55.8-59.5)
  fq(N1) < fq(N2)    52.1  52.1  51.8  53.3 (51.4-55.2)

Holding out QN data

[Figure: grid of quantifier pairs, with rows and columns labeled all, both, each, either, every, few, many, most, much, no, several, some.]

  pair-out: hold out one quantifier pair for testing.
  quantifier-out: hold out every pair involving one quantifier for testing.

Results at quantifier type

                           P     R     F     Accuracy (95% C.I.)
  SVM pair-out             76.7  77.0  76.8  78.1 (77.5-78.8)
  SVM quantifier-out       70.1  65.3  68.0  71.0 (70.3-71.7)
  SVM Q pair-out           67.9  69.8  68.9  70.2 (69.5-70.9)
  SVM Q quantifier-out     53.3  52.9  53.1  56.0 (55.2-56.8)
  cos(QN1, QN2)            52.9  52.3  52.3  53.1 (52.3-53.9)
  balAPinc AN ⊨ N          46.7   5.6  10.0  52.5 (51.7-53.3)
  SVM AN ⊨ N                2.8  42.9   5.2  52.4 (51.7-53.2)
  fq(QN1) < fq(QN2)        51.0  47.4  49.1  50.2 (49.4-51.0)
  balAPinc upper           47.1  100   64.1  47.2 (46.4-47.9)

Holding out each quantifier

              Instances        Correct
  Quantifier  ⊨       ⊭        ⊨      ⊭
  each          656     656     649    637  (98%)
  every         460    1322     402   1293  (95%)
  much          248       0     216      0  (87%)
  all          2949    2641    2011   2494  (81%)
  several      1731    1509    1302   1267  (79%)
  many         3341    4163    2349   3443  (77%)
  few             0     461       0    311  (67%)
  most          928     832     549    511  (60%)
  some         4062    3145    1780   2190  (55%)
  no              0     714       0    380  (53%)
  both          636    1404     589    303  (44%)
  either         63      63       2     41  (34%)
  Total       15074   16910    9849  12870  (71%)
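The percentages in this table follow from the four count columns: correct instances over all instances, pooling entailing and non-entailing pairs. The sketch below recomputes them from the counts on the slide; the column interpretation (entailing first, non-entailing second) is inferred from the per-pair counts, and the variable names are ours.

```python
# Per quantifier: (entailing instances, non-entailing instances,
#                  entailing correct, non-entailing correct), from the slide.
rows = {
    "each": (656, 656, 649, 637),
    "every": (460, 1322, 402, 1293),
    "much": (248, 0, 216, 0),
    "all": (2949, 2641, 2011, 2494),
    "several": (1731, 1509, 1302, 1267),
    "many": (3341, 4163, 2349, 3443),
    "few": (0, 461, 0, 311),
    "most": (928, 832, 549, 511),
    "some": (4062, 3145, 1780, 2190),
    "no": (0, 714, 0, 380),
    "both": (636, 1404, 589, 303),
    "either": (63, 63, 2, 41),
}

def accuracy_percent(row):
    """Correct instances over all instances, as a rounded percentage."""
    pos, neg, pos_ok, neg_ok = row
    return round(100 * (pos_ok + neg_ok) / (pos + neg))

# Column sums reproduce the Total row.
total = tuple(sum(column) for column in zip(*rows.values()))
per_quantifier = {q: accuracy_percent(r) for q, r in rows.items()}
print(total, accuracy_percent(total))
print(per_quantifier)
```

The column totals are twice the number of QN pairs in the data set, since each pair is counted once for each of its two quantifiers.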

Our questions answered

- Entailment among composite phrases rather than nouns? Yes. (Cheap training data!) Practical import.
- Entailment among logical words rather than content words? Yes. (Part of Recognizing Textual Entailment?) Practical import.
- Different entailment relations at different semantic types? Yes. (Prediction from formal semantics.)

Noun type: train on AN ⊨ N (big cat ⊨ cat), test on N ⊨ N (dog ⊨ animal).
Quantifier type: train on QN ⊨ QN (many dogs ⊨ some dogs), test on QN ⊨ QN (all cats ⊨ several cats).

Ongoing work: How does the SVM work? Missing experiments? How to compose semantic vectors?

Holding out each quantifier pair

Entailing pairs:

  Quantifier pair   Instances  Correct
  all ⊨ some             1054  1044 (99%)
  all ⊨ several           557   550 (99%)
  each ⊨ some             656   647 (99%)
  all ⊨ many              873   772 (88%)
  much ⊨ some             248   217 (88%)
  every ⊨ many            460   400 (87%)
  many ⊨ some             951   822 (86%)
  all ⊨ most              465   393 (85%)
  several ⊨ some          580   439 (76%)
  both ⊨ some             573   322 (56%)
  many ⊨ several          594   113 (19%)
  most ⊨ many             463    84 (18%)
  both ⊨ either            63     1 (2%)

Non-entailing pairs:

  Quantifier pair   Instances  Correct
  some ⊭ every            484   481 (99%)
  several ⊭ all           557   553 (99%)
  several ⊭ every         378   375 (99%)
  some ⊭ all             1054  1043 (99%)
  many ⊭ every            460   452 (98%)
  some ⊭ each             656   640 (98%)
  few ⊭ all               157   153 (97%)
  many ⊭ all              873   843 (97%)
  both ⊭ most             369   347 (94%)
  several ⊭ few           143   134 (94%)
  both ⊭ many             541   397 (73%)
  many ⊭ most             463   300 (65%)
  either ⊭ both            63    39 (62%)
  many ⊭ no               714   369 (52%)
  some ⊭ many             951   468 (49%)
  few ⊭ many              161    33 (20%)
  both ⊭ several          431    63 (15%)