Creating a Map from Text
The Running Reality desktop app can find the geographic references in a text file, a capability of value to researchers bridging the narrative and graphical worlds.
Most often, historical material is text. Maps, databases, and other structured data comprise only a small fraction of the available historical information. While Running Reality has extensive tools for operating on geographical data, such as a latitude and longitude coordinate or a city or port name, it has traditionally had fewer tools for narrative and text. We are rectifying this. Running Reality can now analyze a plain text file and produce a map of the locations mentioned in the text.
Geocoders are an emerging class of research tools for those who wish to bridge narrative text and geographic information. A geocoder, such as GeoNames, takes a name as input and returns a latitude and longitude. Geocoders generally specialize in data valuable to commercial or governmental interests rather than to historians. Tools built on top of such systems can analyze whole sections of historical text and generate a map of all the locations mentioned in that text.
Running Reality can perform such an analysis, using its internal geocoder.
|The internal Running Reality geocoder can match text to 250,000+ historical objects.|
|The GeoNames dataset is vast and focuses on modern names, but it also has extensive historical names and names in a wide range of alphabets and languages.|
|The Pelagios dataset focuses specifically on historical names in the region of ancient Rome and Greece.|
Linked open data (LOD) is an important concept for the future of digital history research. The Pelagios Network, in particular, emphasizes the importance of being able to crosswalk data from different digital history projects to each other using common reference identifiers as the links. Further, they have been working to develop ways to link historical documents to new digital history tools like Running Reality.
Geocoders are at the core of this effort because their identifier codes are linked to a wide range of alternative names for a location. For example, GeoNames has the identifier 264371 for the city of Athens, Greece, and links that identifier to approximately 75 alternate names and spellings for the city in different languages, alphabets, and historical periods.
Running Reality interlinks with these geocoders in multiple ways and uses them, in turn, to interlink with other open datasets like Wikipedia. Use Running Reality to identify locations in a text you are using for research; those locations can then be used as part of the wider linked open data ecosystem.
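To illustrate how one identifier can tie many spellings together, here is a minimal Python sketch. It is not how GeoNames or Running Reality store their data. Only the identifier 264371 comes from the text above; the dictionary ALTERNATE_NAMES, the handful of names in it, and the functions build_name_index and resolve are assumptions made for illustration.

```python
# Hypothetical in-memory gazetteer: identifier -> alternate names.
# Only the identifier 264371 for Athens comes from the text; the names listed
# are a small illustrative subset, not the actual GeoNames record.
ALTERNATE_NAMES = {
    264371: ["Athens", "Athína", "Αθήνα", "Athenae", "Athen", "Atenas"],
}


def build_name_index(alternate_names):
    """Invert identifier -> names into a case-insensitive name -> identifier lookup."""
    index = {}
    for identifier, names in alternate_names.items():
        for name in names:
            index[name.casefold()] = identifier
    return index


NAME_INDEX = build_name_index(ALTERNATE_NAMES)


def resolve(name):
    """Return the shared identifier for a name, or None if the name is unknown."""
    return NAME_INDEX.get(name.casefold())
```

With such an index, a Latin spelling like "Athenae" and a Spanish spelling like "Atenas" both resolve to the same identifier, and that shared identifier is what allows two datasets to be linked.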
Use the Add Data Source menu to import a plain text file as a layer. When the layer is loaded, Running Reality will check the text against the available geocoders to find the locations mentioned in the text. The result is a map layer with markers for each location mentioned. This layer can be styled like any other Running Reality layer to change the appearance of the markers. The import process and the layer style can be changed from the editing toolbar that appears when you click the Edit button below the text.
Checking every word in the text would be very slow. Many English words, like "the," "towards," and "beyond," are never place names, so the 100 most common English words are skipped. For the more advanced import options, Running Reality checks its more comprehensive geocoder, which makes the check for each word slower still. It therefore runs the more comprehensive check only on the words most likely to be proper names of locations: those that start with a capital letter.
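The filtering described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Running Reality implementation: COMMON_WORDS is a tiny stand-in for the list of the 100 most common English words, and the tokenizing pattern and the function name candidate_words are hypothetical.

```python
import re

# Assumption: a tiny stand-in for the list of the 100 most common English words.
COMMON_WORDS = {"the", "towards", "beyond", "of", "and", "to", "from", "a", "in"}

# Assumption: a word is a run of letters, optionally joined by ' or -.
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def candidate_words(text):
    """Return (basic, power) candidate lists for a piece of text.

    basic: every word that is not a common word.
    power: the subset of basic words that start with a capital letter.
    """
    words = WORD_PATTERN.findall(text)
    basic = [w for w in words if w.lower() not in COMMON_WORDS]
    power = [w for w in basic if w[0].isupper()]
    return basic, power
```

For the sentence "The fleet sailed from Athens towards Rhodes and beyond." this yields the basic candidates fleet, sailed, Athens, Rhodes and the capitalized candidates Athens, Rhodes, so only two words would reach the slower check.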
|Basic Name Matching|
|This is a fast but sloppy algorithm that checks nearly all the words in your text against the 250,000+ Running Reality name set.|
|Power Name Matching|
|This is a slower algorithm that does the Basic check and then checks capitalized words against the GeoNames geocoder, which has names in more languages and alphabets.|
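The difference between the two options can be sketched as a two-tier lookup. Everything named here is hypothetical: BASIC_NAMES stands in for the Running Reality name set, EXTENDED_NAMES stands in for a broader geocoder such as GeoNames, and the coordinates are approximate values chosen for illustration. A real Power check would query a geocoder service rather than a local dictionary.

```python
# Hypothetical stand-ins for the two name sets; the real sets are far larger.
# Values are approximate (latitude, longitude) pairs for illustration only.
BASIC_NAMES = {"athens": (37.98, 23.73), "rhodes": (36.44, 28.22)}
EXTENDED_NAMES = {"atenas": (37.98, 23.73)}  # a Spanish name for Athens


def match_locations(words, power=False):
    """Map each matched word to a (latitude, longitude) pair.

    Every word is tried against the basic name set. When power is True,
    capitalized words that the basic set missed are also tried against
    the extended name set.
    """
    matches = {}
    for word in words:
        key = word.casefold()
        if key in BASIC_NAMES:
            matches[word] = BASIC_NAMES[key]
        elif power and word[:1].isupper() and key in EXTENDED_NAMES:
            matches[word] = EXTENDED_NAMES[key]
    return matches
```

Calling match_locations(["From", "Atenas", "to", "Rhodes"]) finds only Rhodes, while the same call with power=True also finds Atenas, at the cost of an extra lookup for each unmatched capitalized word.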
Development is underway to use the machine learning algorithms of the Apache Open Natural Language Processing (OpenNLP) project to better identify proper names and multi-word proper names.
|GeoJSON|A file with a feature for each location mentioned in the text, with a point geometry and a name property.|.geojson|
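A file of the kind described above can be sketched in Python as follows. The function names to_geojson and write_geojson and the input dictionary are assumptions for illustration, and the exact properties Running Reality writes may differ. Note that GeoJSON stores each point as longitude first, then latitude.

```python
import json


def to_geojson(locations):
    """Build a GeoJSON FeatureCollection from {name: (latitude, longitude)}."""
    features = [
        {
            "type": "Feature",
            # GeoJSON coordinate order is [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name},
        }
        for name, (lat, lon) in locations.items()
    ]
    return {"type": "FeatureCollection", "features": features}


def write_geojson(locations, path):
    """Write the feature collection to a .geojson file as UTF-8 text."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(to_geojson(locations), handle, ensure_ascii=False, indent=2)
```

For example, write_geojson({"Athens": (37.98, 23.73)}, "locations.geojson") would produce a file with one point feature named Athens; the coordinates here are approximate and only illustrative.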
We could use your assistance to help our team ensure that the latest research can be accessible to our audience. If you were to contribute your data to Running Reality it would be very valuable to the project and would provide citation links back to your primary sources and research material. So, after you perform a text analysis using Running Reality, we would love for you to consider sharing that same data in the form of factoids.
If you cannot find an answer here, or would like to request a feature, please feel free to ask us for help. Send us an email if you would like us to get back to you with a response: