last thing we do is geoparse the texts
- extract place names
- resolve them to structured geographic information
the Mordecai library does this, using neural networks (NN) to resolve each place name to the correct place based on its context
how does Mordecai work?
- uses spaCy's named entity recognition to extract place names from the text.
- uses the GeoNames gazetteer in an Elasticsearch index to find the potential coordinates of extracted place names.
- uses neural networks implemented in Keras and trained on new annotated English-language data labeled with Prodigy to infer the correct country and correct gazetteer entries for each place name.
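toy sketch of the three steps above (extract, look up candidates, disambiguate by context). this is NOT Mordecai's code: the gazetteer, the cue words, the approximate coordinates and all function names are hypothetical stand-ins, with a capitalized-word match replacing spaCy NER, a dict replacing the Elasticsearch index, and cue-word counting replacing the neural networks.

```python
import re

# Hypothetical toy gazetteer: place name -> candidate entries (approximate coordinates).
TOY_GAZETTEER = {
    "Oxford": [
        {"country": "GBR", "lat": 51.75, "lon": -1.26},
        {"country": "USA", "lat": 34.37, "lon": -89.52},
    ],
    "Ottawa": [{"country": "CAN", "lat": 45.41, "lon": -75.70}],
}

# Hypothetical context cues standing in for the learned country model.
COUNTRY_CUES = {
    "GBR": {"England", "British"},
    "USA": {"Mississippi", "American"},
    "CAN": {"Canada", "Canadian"},
}


def extract_place_names(text):
    """Step 1 stand-in: keep capitalized words that appear in the toy gazetteer."""
    tokens = re.findall(r"[A-Z][a-z]+", text)
    return [t for t in tokens if t in TOY_GAZETTEER]


def lookup_candidates(name):
    """Step 2 stand-in: return every gazetteer entry sharing this name."""
    return TOY_GAZETTEER.get(name, [])


def pick_candidate(candidates, text):
    """Step 3 stand-in: prefer the candidate whose country cues occur in the text."""
    words = set(re.findall(r"\w+", text))

    def score(candidate):
        return len(COUNTRY_CUES.get(candidate["country"], set()) & words)

    # On a tie, max() keeps the first candidate in gazetteer order.
    return max(candidates, key=score)


def toy_geoparse(text):
    """Run the three stand-in steps and return one resolved entry per place name."""
    results = []
    for name in extract_place_names(text):
        best = pick_candidate(lookup_candidates(name), text)
        results.append({"word": name, **best})
    return results
```

the point of the sketch is only the shape of the problem: "Oxford" has several gazetteer entries, and the surrounding words decide which one is kept.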
on spaCy
*most of this is taken care of by the library → aside from some cleaning and reorganizing, little needs changing once the library has extracted the relevant information
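minimal sketch of the "cleaning and reorganizing" step, assuming the records look like the output shown in Mordecai's README (a list of dicts with `word`, `country_predicted` and a nested `geo` dict whose `lat`/`lon` are strings, with `geo` missing when no gazetteer entry was found). the sample records are hand-written for illustration, and `flatten_place` / `flatten_places` are hypothetical helper names, not part of the library.

```python
# Hand-written sample in the assumed output shape (not real pipeline output).
SAMPLE_RECORDS = [
    {
        "word": "Oxford",
        "country_predicted": "GBR",
        "geo": {
            "place_name": "Oxford",
            "admin1": "England",
            "lat": "51.75222",
            "lon": "-1.25596",
        },
    },
    # Assumed case: a place name with no resolved gazetteer entry.
    {"word": "Atlantis", "country_predicted": "NA"},
]


def flatten_place(record):
    """Turn one nested record into a flat row with numeric coordinates."""
    geo = record.get("geo") or {}
    return {
        "word": record.get("word"),
        "country": record.get("country_predicted"),
        "place_name": geo.get("place_name"),
        "admin1": geo.get("admin1"),
        "lat": float(geo["lat"]) if "lat" in geo else None,
        "lon": float(geo["lon"]) if "lon" in geo else None,
    }


def flatten_places(records, drop_unresolved=True):
    """Flatten all records, optionally dropping those without coordinates."""
    rows = [flatten_place(r) for r in records]
    if drop_unresolved:
        rows = [r for r in rows if r["lat"] is not None and r["lon"] is not None]
    return rows
```

flat rows like these are easy to load into a table or plot on a map; whether to drop unresolved names or keep them for manual review is a judgment call for the project.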