Friday, 27 March 2015

Extracting Head nouns



How can I extract head nouns from a sentence? I first tried a constituency parser, which did not work, so I guess I have to use a dependency parser. I ran the demo code below, but it gives me a wrong answer.



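import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;

import edu.stanford.nlp.io.IOUtils;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations;
import edu.stanford.nlp.util.CoreMap;

public class dependencydemo {
    public static void main(String[] args) throws IOException {
        // Write to the file named by the second argument, or to stdout.
        PrintWriter out;
        if (args.length > 1) {
            out = new PrintWriter(args[1]);
        } else {
            out = new PrintWriter(System.out);
        }

        // The no-argument constructor loads the default pipeline properties.
        StanfordCoreNLP pipeline = new StanfordCoreNLP();

        // Read the text from the first argument, or fall back to a sample.
        Annotation annotation;
        if (args.length > 0) {
            annotation = new Annotation(IOUtils.slurpFileNoExceptions(args[0]));
        } else {
            annotation = new Annotation("Yesterday, I went to the Dallas Country Club to play 25 cent Bingo. "
                    + "While I was there I talked to my friend Jim and we both agree that those people "
                    + "in Washington are destroying our economy.");
        }

        pipeline.annotate(annotation);
        pipeline.prettyPrint(annotation, out);

        // Print the constituency tree of the first sentence only.
        List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
        if (sentences != null && sentences.size() > 0) {
            CoreMap sentence = sentences.get(0);
            Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
            // out.println();
            // out.println("The first sentence parsed is:");
            tree.pennPrint(out);
        }

        // Make sure buffered output is actually written.
        out.flush();
    }
}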


Output:



(ROOT
  (S
    (NP-TMP (NN Yesterday))
    (, ,)
    (NP (PRP I))
    (VP (VBD went)
      (PP (TO to)
        (NP (DT the) (NNP Dallas) (NNP Country) (NNP Club)))
      (S
        (VP (TO to)
          (VP (VB play)
            (S
              (NP (CD 25) (NN cent))
              (NP (NNP Bingo)))))))
    (. .)))


Dependencies:



root(ROOT-0, went-4)
tmod(went-4, Yesterday-1)
nsubj(went-4, I-3)
det(Club-9, the-6)
nn(Club-9, Dallas-7)
nn(Club-9, Country-8)
prep_to(went-4, Club-9)
aux(play-11, to-10)
xcomp(went-4, play-11)
num(cent-13, 25-12)
nsubj(Bingo-14, cent-13)
xcomp(play-11, Bingo-14)


How can I extract the head nouns from this? Apart from that, the output itself does not seem to be correct.
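
To make the question concrete, here is a rough sketch of one possible heuristic, not a definitive method: treat a token as a head noun when it is the governor of a noun-modifier relation in the dependency list above. The class name HeadNounSketch, the method headNouns, and the choice of det, nn and num as the relevant relations are all assumptions made for illustration. The sketch only parses the printed dependency lines as text and does not call the CoreNLP API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the names and the heuristic are illustrative assumptions.
class HeadNounSketch {
    // Assumption: the governor of any of these relations heads a noun phrase.
    private static final Set<String> NOUN_MODIFIER_RELATIONS =
            new LinkedHashSet<>(Arrays.asList("det", "nn", "num"));

    // Matches printed dependencies such as "nn(Club-9, Dallas-7)".
    private static final Pattern DEPENDENCY =
            Pattern.compile("(\\w+)\\((\\S+)-(\\d+), (\\S+)-(\\d+)\\)");

    // Returns the distinct governors ("word-index") of noun-modifier relations,
    // in the order they first appear.
    static List<String> headNouns(List<String> dependencyLines) {
        Set<String> heads = new LinkedHashSet<>();
        for (String line : dependencyLines) {
            Matcher m = DEPENDENCY.matcher(line.trim());
            if (m.matches() && NOUN_MODIFIER_RELATIONS.contains(m.group(1))) {
                heads.add(m.group(2) + "-" + m.group(3));
            }
        }
        return new ArrayList<>(heads);
    }

    public static void main(String[] args) {
        // The dependency list printed for the first sentence.
        List<String> deps = Arrays.asList(
                "root(ROOT-0, went-4)",
                "tmod(went-4, Yesterday-1)",
                "nsubj(went-4, I-3)",
                "det(Club-9, the-6)",
                "nn(Club-9, Dallas-7)",
                "nn(Club-9, Country-8)",
                "prep_to(went-4, Club-9)",
                "aux(play-11, to-10)",
                "xcomp(went-4, play-11)",
                "num(cent-13, 25-12)",
                "nsubj(Bingo-14, cent-13)",
                "xcomp(play-11, Bingo-14)");
        System.out.println(headNouns(deps)); // prints [Club-9, cent-13]
    }
}
```

This heuristic misses bare nouns that have no modifiers (Yesterday, Bingo) and inherits whatever mistakes the parser makes, such as cent being treated as a separate phrase from Bingo here, so it only illustrates the kind of result I am after.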



