text mining - Lucene Entity Extraction
Given a finite dictionary of entity terms, I'm looking for a way to do entity extraction with intelligent tagging using Lucene. I've been able to use Lucene for:
- searching for complex phrases with fuzziness
- highlighting results
However, I'm not aware of how to:
- get accurate offsets of the matched phrases
- do entity-specific annotations per match (not just tags for every single hit)
I have tried using the explain() method, but it only gives the terms in the query that got the hit, not the offsets of the hit within the original text.
Has anyone faced a similar problem, and would you be willing to share a potential solution?
Thank you in advance for your help!
For the offsets, see this question: How to get the offset of a term in Lucene?
I don't quite understand your second question. It sounds to me like you want the data from a stored field, though. To get the data from a stored field:
    TopDocs results = searcher.search(query, filter, num);
    // Load each hit's document and read the value of the stored field.
    for (ScoreDoc result : results.scoreDocs) {
        Document resultDoc = searcher.doc(result.doc);
        String valOfField = resultDoc.get("my field");
    }
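To make the two requests in the question concrete (character offsets for each matched phrase, and an entity-specific label per match), here is a minimal sketch in plain Java. It deliberately does not use Lucene: it swaps in `java.util.regex` for the matching step, does exact case-insensitive matching only (no fuzziness), and assumes every dictionary term starts and ends with a word character. The names `DictionaryTagger`, `Match`, and `tag` are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: tags dictionary terms in raw text.
final class DictionaryTagger {

    // One tagged occurrence: the entity label plus character offsets into the text.
    static final class Match {
        final String entity;   // label taken from the dictionary, e.g. "CITY"
        final String surface;  // the text exactly as it appeared
        final int start;       // inclusive character offset
        final int end;         // exclusive character offset

        Match(String entity, String surface, int start, int end) {
            this.entity = entity;
            this.surface = surface;
            this.start = start;
            this.end = end;
        }
    }

    // dictionary maps a term (phrase) to its entity label.
    static List<Match> tag(String text, Map<String, String> dictionary) {
        List<Match> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : dictionary.entrySet()) {
            // Quote the term so it is matched literally; \b keeps matches on
            // word boundaries (assumes the term starts and ends with a word character).
            Pattern pattern = Pattern.compile(
                    "\\b" + Pattern.quote(entry.getKey()) + "\\b",
                    Pattern.CASE_INSENSITIVE);
            Matcher matcher = pattern.matcher(text);
            while (matcher.find()) {
                matches.add(new Match(entry.getValue(), matcher.group(),
                        matcher.start(), matcher.end()));
            }
        }
        // Report matches in the order they occur in the text.
        matches.sort((a, b) -> Integer.compare(a.start, b.start));
        return matches;
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = new LinkedHashMap<>();
        dictionary.put("new york", "CITY");
        dictionary.put("apache lucene", "SOFTWARE");

        String text = "Apache Lucene was presented in New York.";
        for (Match m : tag(text, dictionary)) {
            System.out.println(m.entity + " [" + m.start + ", " + m.end + ") " + m.surface);
        }
    }
}
```

Scanning the text once per dictionary term is fine for a small dictionary but scales poorly for a large one. The sketch is only meant to show the shape of the output being asked for, not a recommended replacement for an index-based approach.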