Most geocoding services, such as GeoNames or the Google Maps API, are well optimized for resolving jurisdictions, businesses, addresses, and natural features like lakes and rivers. However, geological features such as shields, cratons, faults, and folds are not represented in their data. Use cases for geographic search of geological features range from data integration to discovery, but no appropriate tool exists.
This presentation will outline a prototype built at Elsevier Labs entirely with open source tools and technologies. Mining an internal database of geolocated map insets from journal articles, we built a 350,000-entry Geological Entity Location Engine. Our approach generated geohashes at several levels of granularity for nearly 600,000 georeferenced maps and aligned these geohashes with entities extracted from the map captions. The resulting pairings were loaded into Elasticsearch, which was used to aggregate them and begin separating signal from noise in the dataset. All development was done using FOSS, including the Python libraries Pandas, Shapely, Geohash, and spaCy. Our presentation will include a demo of geocoding based on both text entry and document upload.
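The pairing of geohashes with caption entities can be illustrated with a minimal sketch. It is not the prototype's code: it assumes each map is reduced to a single centroid point (the real maps presumably cover areas), it re-implements the standard geohash encoding in pure Python rather than calling the Geohash library, and it replaces the Elasticsearch aggregation with an in-memory `Counter`. The function names `geohash_encode`, `pair_counts`, and `best_cell`, and the sample records, are hypothetical and invented for illustration.

```python
from collections import Counter

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision):
    """Encode a point as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, value = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = (value << 1) | int(bit)
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def pair_counts(records, precisions=(2, 4)):
    """Count (entity, geohash cell) pairings at several granularities.

    `records` is an iterable of (lat, lon, entities) tuples, where the
    point stands in for a map (assumption: centroid only).
    """
    counts = Counter()
    for lat, lon, entities in records:
        for precision in precisions:
            cell = geohash_encode(lat, lon, precision)
            for entity in set(entities):
                counts[(entity.lower(), cell)] += 1
    return counts


def best_cell(counts, entity, precision):
    """Return the most frequently paired cell for an entity, or None."""
    candidates = {
        cell: n
        for (name, cell), n in counts.items()
        if name == entity.lower() and len(cell) == precision
    }
    return max(candidates, key=candidates.get) if candidates else None


# Made-up records: two maps agree on a location, one is noise.
records = [
    (57.64911, 10.40744, ["Example Fault"]),
    (57.60, 10.45, ["Example Fault", "Sample Craton"]),
    (0.0, 0.0, ["Example Fault"]),
]
counts = pair_counts(records)
likely_cell = best_cell(counts, "Example Fault", 2)
```

Counting how often an entity co-occurs with a cell is one simple way to let repeated pairings outweigh stray ones; coarser precisions group more maps per cell, while finer ones localize the entity more tightly.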