Natural Language Parsing examples
1. Natural Language Parsing examples
3. Part of speech
To stimulate ideas, here is some output from several available natural language processing (NLP) and part-of-speech (POS) tagging systems.
The default settings were used. There are many ways to customize and tweak each system depending on the application domain.
4. Stanford NLP group
A leading NLP group is at Stanford University, at the following URL.
Their software is summarized and available from the following URL.
5. NLTK
NLTK (Natural Language Toolkit) is available at the following URL.
"NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning."
6. Example sentences
Here are some example sentences.
This article contains a discussion of the history of commercial and academic efforts to automate patent classifications. It also suggests new approaches (adding additional structured language to the text) that (it asserts) lead to statistically meaningful improvements.
7. Parse trees
Here is the parse tree from NLTK for the first sentence.
8. Stanford parser
Here is the output from the Stanford Parser, which provides "implementations of probabilistic natural language parsers in Java: highly optimized PCFG and dependency parsers, a lexicalized PCFG parser, and a deep learning reranker". For each sentence, the phrase-structure tree is followed by its typed dependencies.
(ROOT
  (S
    (NP (DT This) (NN article))
    (VP (VBZ contains)
      (S
        (NP
          (NP (DT a) (NN discussion))
          (PP (IN of)
            (NP
              (NP (DT the) (NN history))
              (PP (IN of)
                (NP
                  (UCP (JJ commercial)
                    (CC and)
                    (JJ academic))
                  (NNS efforts))))))
        (VP (TO to)
          (VP (VB automate)
            (NP (NN patent) (NNS classifications))))))
    (. .)))
det(article-2, This-1)
nsubj(contains-3, article-2)
root(ROOT-0, contains-3)
det(discussion-5, a-4)
nsubj(automate-15, discussion-5)
det(history-8, the-7)
prep_of(discussion-5, history-8)
amod(efforts-13, commercial-10)
conj_and(commercial-10, academic-12)
amod(efforts-13, academic-12)
prep_of(history-8, efforts-13)
aux(automate-15, to-14)
xcomp(contains-3, automate-15)
nn(classifications-17, patent-16)
dobj(automate-15, classifications-17)
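The typed dependencies above follow a regular shape, relation(head-index, dependent-index), so they can be read back into structured records with a regular expression. The following is a minimal sketch using only the Python standard library; the names Dependency, DEP_PATTERN, and parse_dependency are hypothetical and are not part of the Stanford Parser or NLTK.

```python
import re
from typing import NamedTuple


class Dependency(NamedTuple):
    """One typed dependency (hypothetical record type for illustration)."""
    relation: str
    head: str
    head_index: int
    dependent: str
    dependent_index: int


# Matches lines such as "det(article-2, This-1)".
# The greedy (.+) keeps any hyphens inside a word and leaves the final
# "-<digits>" as the token index.
DEP_PATTERN = re.compile(r"^(\w+)\((.+)-(\d+), (.+)-(\d+)\)$")


def parse_dependency(line: str) -> Dependency:
    """Parse one dependency line; raise ValueError if it does not match."""
    match = DEP_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a typed dependency: {line!r}")
    relation, head, head_index, dependent, dependent_index = match.groups()
    return Dependency(relation, head, int(head_index),
                      dependent, int(dependent_index))


# A few lines copied from the output above.
lines = [
    "det(article-2, This-1)",
    "nsubj(contains-3, article-2)",
    "root(ROOT-0, contains-3)",
    "prep_of(discussion-5, history-8)",
]
deps = [parse_dependency(line) for line in lines]
for dep in deps:
    print(dep.relation, dep.head, "->", dep.dependent)
```

Keeping the token indices as integers makes it possible to tell apart repeated words, such as the two occurrences of "of" in the first sentence.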
(ROOT
  (S
    (NP (PRP It))
    (ADVP (RB also))
    (VP (VBZ suggests)
      (NP
        (NP (JJ new) (NNS approaches))
        (PRN (-LRB- -LRB-)
          (VP (VBG adding)
            (NP (JJ additional) (JJ structured) (NN language))
            (PP (TO to)
              (NP (DT the) (NN text))))
          (-RRB- -RRB-))
        (SBAR
          (WHNP (WDT that))
          (S
            (PRN (-LRB- -LRB-)
              (S
                (NP (PRP it))
                (VP (VBZ asserts)))
              (-RRB- -RRB-))
            (VP (VBP lead)
              (PP (TO to)
                (NP
                  (ADJP (RB statistically) (JJ meaningful))
                  (NNS improvements))))))))
    (. .)))
nsubj(suggests-3, It-1)
advmod(suggests-3, also-2)
root(ROOT-0, suggests-3)
amod(approaches-5, new-4)
dobj(suggests-3, approaches-5)
nsubj(lead-20, approaches-5)
dep(approaches-5, adding-7)
amod(language-10, additional-8)
amod(language-10, structured-9)
dobj(adding-7, language-10)
det(text-13, the-12)
prep_to(adding-7, text-13)
nsubj(asserts-18, it-17)
parataxis(lead-20, asserts-18)
rcmod(approaches-5, lead-20)
advmod(meaningful-23, statistically-22)
amod(improvements-24, meaningful-23)
prep_to(lead-20, improvements-24)
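The bracketed trees in this section are plain S-expressions, so a short recursive reader is enough to load them into nested Python structures. The sketch below uses only the standard library and is an illustration, not the Stanford Parser's own reader; the function names parse_tree, leaves, and tagged_words are hypothetical. It assumes the input is well-bracketed (an unbalanced tree would raise an IndexError or ValueError).

```python
def parse_tree(text):
    """Read a bracketed tree into nested (label, children) tuples.

    Children are either nested tuples or plain word strings.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError(f"expected '(' at token {pos}")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected text after the tree")
    return tree


def leaves(node):
    """Return the words of the tree in left-to-right order."""
    _, children = node
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


def tagged_words(node):
    """Return (word, tag) pairs taken from the nodes directly above words."""
    label, children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [(children[0], label)]
    pairs = []
    for child in children:
        if isinstance(child, tuple):
            pairs.extend(tagged_words(child))
    return pairs


# A shortened tree in the same notation as the output above.
tree = parse_tree("(S (NP (DT This) (NN article)) (VP (VBZ contains)) (. .))")
print(leaves(tree))
print(tagged_words(tree))
```

Reading the (word, tag) pairs off the tree gives the parser's own part-of-speech decisions, which can then be compared with the output of a standalone tagger.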
9. Stanford tagger
Here is the output from the Stanford POS Tagger, a "maximum-entropy (CMM) part-of-speech (POS) tagger for English, Arabic, Chinese, French, and German, in Java".
This_DT
article_NN
contains_VBZ
a_DT
discussion_NN
of_IN
the_DT
history_NN
of_IN
commercial_JJ
and_CC
academic_JJ
efforts_NNS
to_TO
automate_VB
patent_NN
classifications_NNS
._.
It_PRP
also_RB
suggests_VBZ
new_JJ
approaches_NNS
-LRB-_-LRB-
adding_VBG
additional_JJ
structured_JJ
language_NN
to_TO
the_DT
text_NN
-RRB-_-RRB-
that_WDT
-LRB-_-LRB-
it_PRP
asserts_VBZ
-RRB-_-RRB-
lead_NN
to_TO
statistically_RB
meaningful_JJ
improvements_NNS
._.
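The tagger output above joins each word to its tag with an underscore. A small helper can split these tokens back into (word, tag) pairs. The sketch below is one possible approach using only the standard library; split_tagged and parse_tagger_output are hypothetical names. It splits on the last underscore so that bracket tokens such as -LRB-_-LRB- and the sentence-final ._. are handled the same way as ordinary words.

```python
from typing import List, Tuple


def split_tagged(token: str) -> Tuple[str, str]:
    """Split 'word_TAG' on the last underscore into (word, tag)."""
    word, sep, tag = token.rpartition("_")
    if not sep or not word or not tag:
        raise ValueError(f"not a word_TAG token: {token!r}")
    return word, tag


def parse_tagger_output(text: str) -> List[Tuple[str, str]]:
    """Split whitespace-separated word_TAG tokens into (word, tag) pairs."""
    return [split_tagged(token) for token in text.split()]


# The first few tokens from the output above, joined on one line.
sample = "This_DT article_NN contains_VBZ a_DT discussion_NN of_IN"
pairs = parse_tagger_output(sample)

# Penn Treebank noun tags all start with "NN" (NN, NNS, NNP, NNPS).
nouns = [word for word, tag in pairs if tag.startswith("NN")]
print(pairs)
print(nouns)
```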
10. NLTK taggers
Using NLTK, here is the part-of-speech output of the NLTK tagger and of the Stanford tagger (via the NLTK API), both with the default settings, along with the differences between the two taggers.
NLTK tagger: (default settings)
0. This : DT = Determiner
1. article : NN = Noun, singular or mass
2. contains : VBZ = Verb, 3rd person singular present
3. a : DT = Determiner
4. discussion : NN = Noun, singular or mass
5. of : IN = Preposition or subordinating conjunction
6. the : DT = Determiner
7. history : NN = Noun, singular or mass
8. of : IN = Preposition or subordinating conjunction
9. commercial : JJ = Adjective
10. and : CC = Coordinating conjunction
11. academic : JJ = Adjective
12. efforts : NNS = Noun, plural
13. to : TO = to
14. automate : VB = Verb, base form
15. patent : NN = Noun, singular or mass
16. classifications. : NNP = Proper noun, singular
17. It : NNP = Proper noun, singular
18. also : RB = Adverb
19. suggests : VBZ = Verb, 3rd person singular present
20. new : JJ = Adjective
21. approaches : NNS = Noun, plural
22. ( : VBP = Verb, non-3rd person singular present
23. adding : VBG = Verb, gerund or present participle
24. additional : JJ = Adjective
25. structured : JJ = Adjective
26. language : NN = Noun, singular or mass
27. to : TO = to
28. the : DT = Determiner
29. text : NN = Noun, singular or mass
30. ) : : = Colon or ellipsis
31. that : IN = Preposition or subordinating conjunction
32. ( : CD = Cardinal number
33. it : PRP = Personal pronoun
34. asserts : VBZ = Verb, 3rd person singular present
35. ) : : = Colon or ellipsis
36. lead : NN = Noun, singular or mass
37. to : TO = to
38. statistically : RB = Adverb
39. meaningful : JJ = Adjective
40. improvements : NNS = Noun, plural
41. . : . = Terminator
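A listing in this layout can be produced from (word, tag) pairs with a small lookup table of tag descriptions. The sketch below is an assumption about how such a listing might be generated, not the code used for this page; TAG_DESCRIPTIONS and format_tagged are hypothetical names, and the table holds only a few of the tags, with descriptions copied from the listing above.

```python
# Partial table of Penn Treebank tags (illustrative, not complete).
TAG_DESCRIPTIONS = {
    "DT": "Determiner",
    "NN": "Noun, singular or mass",
    "NNS": "Noun, plural",
    "VBZ": "Verb, 3rd person singular present",
    "IN": "Preposition or subordinating conjunction",
    "JJ": "Adjective",
    "RB": "Adverb",
}


def format_tagged(tagged):
    """Format (word, tag) pairs as 'index. word : TAG = description' lines."""
    lines = []
    for index, (word, tag) in enumerate(tagged):
        # Fall back to a placeholder for tags missing from the partial table.
        description = TAG_DESCRIPTIONS.get(tag, "(unknown tag)")
        lines.append(f"{index}. {word} : {tag} = {description}")
    return lines


tagged = [("This", "DT"), ("article", "NN"), ("contains", "VBZ")]
for line in format_tagged(tagged):
    print(line)
```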
Stanford tagger: (default settings)
0. This : DT = Determiner
1. article : NN = Noun, singular or mass
2. contains : VBZ = Verb, 3rd person singular present
3. a : DT = Determiner
4. discussion : NN = Noun, singular or mass
5. of : IN = Preposition or subordinating conjunction
6. the : DT = Determiner
7. history : NN = Noun, singular or mass
8. of : IN = Preposition or subordinating conjunction
9. commercial : JJ = Adjective
10. and : CC = Coordinating conjunction
11. academic : JJ = Adjective
12. efforts : NNS = Noun, plural
13. to : TO = to
14. automate : VB = Verb, base form
15. patent : JJ = Adjective
16. classifications. : NN = Noun, singular or mass
17. It : PRP = Personal pronoun
18. also : RB = Adverb
19. suggests : VBZ = Verb, 3rd person singular present
20. new : JJ = Adjective
21. approaches : NNS = Noun, plural
22. ( : VBP = Verb, non-3rd person singular present
23. adding : VBG = Verb, gerund or present participle
24. additional : JJ = Adjective
25. structured : JJ = Adjective
26. language : NN = Noun, singular or mass
27. to : TO = to
28. the : DT = Determiner
29. text : NN = Noun, singular or mass
30. ) : NN = Noun, singular or mass
31. that : WDT = Wh-determiner
32. ( : VBZ = Verb, 3rd person singular present
33. it : PRP = Personal pronoun
34. asserts : VBZ = Verb, 3rd person singular present
35. ) : JJ = Adjective
36. lead : NN = Noun, singular or mass
37. to : TO = to
38. statistically : RB = Adverb
39. meaningful : JJ = Adjective
40. improvements : NNS = Noun, plural
41. . : . = Terminator
Differences (for each token, the NLTK tag is listed first, then the Stanford tag):
15. patent : NN = Noun, singular or mass
15. patent : JJ = Adjective
16. classifications. : NNP = Proper noun, singular
16. classifications. : NN = Noun, singular or mass
17. It : NNP = Proper noun, singular
17. It : PRP = Personal pronoun
30. ) : : = Colon or ellipsis
30. ) : NN = Noun, singular or mass
31. that : IN = Preposition or subordinating conjunction
31. that : WDT = Wh-determiner
32. ( : CD = Cardinal number
32. ( : VBZ = Verb, 3rd person singular present
35. ) : : = Colon or ellipsis
35. ) : JJ = Adjective