Geographic Text Search (GTS)

Summary

MetaCarta Geographic Text Search (GTS) combines traditional text (keyword) search with powerful geographic search so users can find content about a place and view the results on a map.

GTS is a geosearch platform that enables knowledge workers, consumers, intelligence analysts, mission planners, and law enforcement agents to find documents in massive collections by zooming a map into their geographic area of interest. The product works with a wide range of mapping systems and provides access to content such as Web pages, articles, and military message traffic stored across file shares, databases, and content management systems.

  • Self-contained Appliance that installs directly on a network
  • Allows users to quickly find relevant documents using a combination of keywords and a map
  • Includes default user interface with world maps for rapid deployment
  • Integrates with map servers for customers with their own maps, including Microsoft Virtual Earth, Google Earth, ESRI ArcGIS, and OpenLayers

MetaCarta GTS identifies implied and explicit references to geographic locations within documents, assigns latitude/longitude coordinates to the references, indexes the document, and then enables a search for indexed documents through a graphical user interface.
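As a rough illustration of the "find a place reference, assign coordinates" step, the sketch below looks up words in a tiny gazetteer. It is not the GTS method: the `GAZETTEER` table and the `geotag` function are hypothetical, the coordinates are approximate city centers, and exact-match lookup handles only explicit references, with no use of context and no disambiguation between places that share a name.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy gazetteer: lowercase place name -> (latitude, longitude).
# Coordinates are approximate city centers, for illustration only.
GAZETTEER: Dict[str, Tuple[float, float]] = {
    "boston": (42.36, -71.06),
    "paris": (48.86, 2.35),
}


def geotag(text: str) -> List[Tuple[str, float, float]]:
    """Return (matched text, latitude, longitude) for each known place name."""
    hits = []
    for token in re.findall(r"[A-Za-z]+", text):
        coords = GAZETTEER.get(token.lower())
        if coords is not None:
            hits.append((token, coords[0], coords[1]))
    return hits


print(geotag("Flights from Boston to Paris resumed."))
```

A production system would also need to resolve ambiguous names and implied references, which is where context analysis comes in.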

Core Components:

  • Base Geographic Data Module (GDM) - MetaCarta’s data is the key to performing geographic searches with MetaCarta GTS
  • ESRI ArcMap Extension - Enables seamless geographic text searches from within ESRI’s ArcMap interface
  • 3 Connectors of your choice (see below)
  • Available in configurations sized for different storage capacities and document counts

 

Optional Components:

Optional Connectors:

  • GTS Database Connector
  • GTS Documentum Connector
  • GTS Livelink Connector
  • GTS RSS Connector
  • GTS SharePoint Connector
  • GTS Web Crawler
  • GTS Windows Share Crawler – Installed off the Appliance
  • GTS Windows Share Connector – Installed on the Appliance

User Interface

MetaCarta GTS is “map agnostic” – the product integrates with map servers to display search results.

Results from a search query appear as icons on a digital map and as entries in a results list. The location of each document icon coincides with a geographic location mentioned within the document. If a user wants to find all documents related to a geographic area (a country, city, latitude/longitude pair, or U.S. street address), MetaCarta GTS renders a map speckled with icons representing every document that includes text pertaining to the identified location. By clicking on a document icon, a user gains direct access to the original document.

Processing Documents

MetaCarta GTS ingests documents from a variety of sources, including Internet and intranet sites, local hard drives and network shares, CD-ROMs and DVDs, databases, and content management systems. Typically, the system administrator performs an initial ingestion of the organization's various document collections and then establishes procedures for periodic updates.

GTS retains a record of each document's source, which it presents as a link when that document appears in a search result set. GTS can also retain the access permissions of ingested documents and filter result sets to ensure that documents are available only to authorized users. GTS can ingest many types of text file, including:

  • plain text (ASCII)
  • rich text
  • HTML
  • Adobe Acrobat (PDF)
  • PostScript
  • XML
  • Microsoft Office files (Word, PowerPoint, Excel)
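The permission-based filtering of result sets mentioned before the file-type list can be pictured with a minimal sketch. It assumes, purely for illustration, that each result carries the set of groups allowed to read it; the `Result` class, the `filter_for_user` function, and the example links are hypothetical and do not describe how GTS actually stores or checks permissions.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Result:
    source_link: str                 # link back to the original document
    allowed_groups: FrozenSet[str]   # assumed permission model: group names


def filter_for_user(results: List[Result], user_groups: FrozenSet[str]) -> List[Result]:
    """Keep only results the user may see (shares at least one group)."""
    return [r for r in results if r.allowed_groups & user_groups]


results = [
    Result("http://intranet.example/a", frozenset({"analysts"})),
    Result("http://intranet.example/b", frozenset({"admins"})),
]
visible = filter_for_user(results, frozenset({"analysts", "staff"}))
print([r.source_link for r in visible])
```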

Using natural language processing, MetaCarta GTS examines text strings and their context. The product also examines each document for temporal references and extracts the first recognized, formatted reference that can be resolved to a specific day.
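The "first reference that can be resolved to the day" rule can be sketched as follows. This is an assumption-laden illustration, not the GTS extractor: it recognizes only ISO-style dates (YYYY-MM-DD), and the `ISO_DATE` pattern and `first_day_resolved_date` function are hypothetical names.

```python
import re
from datetime import date
from typing import Optional

# Assumed format for this sketch: ISO-style dates such as 2008-03-14.
# A real extractor would recognize many more date formats.
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def first_day_resolved_date(text: str) -> Optional[date]:
    """Return the first date in `text` that names a valid calendar day."""
    for match in ISO_DATE.finditer(text):
        year, month, day = (int(group) for group in match.groups())
        try:
            return date(year, month, day)
        except ValueError:
            # Looks like a date but is not a real day (e.g. month 13); keep scanning.
            continue
    return None


sample = "Filed in 2008. Draft 2008-13-40 was rejected; final issued 2008-03-14."
print(first_day_resolved_date(sample))
```

In the sample, the bare year and the invalid date are skipped because neither resolves to a day, so the function returns 14 March 2008.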

MetaCarta GTS extracts or computes the following items for each document:

  • Geographic references, which may be placenames or other forms of geographic annotation such as coordinates, military grid references, etc.
  • A latitude and longitude for each geographic reference
  • A “geoconfidence” score for each geographic reference, which is the estimated probability that the assigned latitude and longitude are correct
  • An emphasis score for each geographic reference and each keyword reference, which is the estimated prominence of the reference in the document
  • The document's first recognized temporal reference
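One way to picture the items listed above is as a per-document record. The sketch below is a hypothetical data model: the class and field names are invented for illustration, and the example values are made up rather than produced by GTS.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class GeoReference:
    text: str             # the reference as it appears in the document
    latitude: float
    longitude: float
    geoconfidence: float  # estimated probability the coordinates are correct (0 to 1)
    emphasis: float       # estimated prominence of the reference in the document


@dataclass
class KeywordReference:
    term: str
    emphasis: float       # estimated prominence of the keyword in the document


@dataclass
class DocumentRecord:
    source: str           # where the document was ingested from
    geo_references: List[GeoReference] = field(default_factory=list)
    keyword_references: List[KeywordReference] = field(default_factory=list)
    first_date: Optional[date] = None  # first recognized temporal reference


# Illustrative values only.
record = DocumentRecord(
    source="http://intranet.example/report.html",
    geo_references=[GeoReference("Boston", 42.36, -71.06, geoconfidence=0.9, emphasis=0.7)],
    keyword_references=[KeywordReference("flood", emphasis=0.5)],
    first_date=date(2008, 3, 14),
)
print(record.geo_references[0].text, record.first_date)
```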

The final step in the GTS processing pipeline is indexing. The system builds specialized search indexes called CartaTrees™, which allow documents to be rapidly retrieved in response to user queries that combine geographic extent, keyword, and temporal factors.
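To show what a query combining geographic extent, keyword, and temporal factors looks like, here is a minimal sketch that filters documents with a plain linear scan. It is not a CartaTree and makes no claim about how that index works; `IndexedDoc`, `Query`, `matches`, and `search` are hypothetical names, and the bounding box test does not handle boxes that cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class IndexedDoc:
    doc_id: str
    text: str
    locations: List[Tuple[float, float]]  # (latitude, longitude) pairs
    first_date: Optional[date]


@dataclass
class Query:
    keyword: Optional[str] = None
    bbox: Optional[Tuple[float, float, float, float]] = None  # south, west, north, east
    date_from: Optional[date] = None
    date_to: Optional[date] = None


def matches(doc: IndexedDoc, q: Query) -> bool:
    """True if the document satisfies every criterion present in the query."""
    if q.keyword is not None and q.keyword.lower() not in doc.text.lower():
        return False
    if q.bbox is not None:
        south, west, north, east = q.bbox
        if not any(south <= lat <= north and west <= lon <= east
                   for lat, lon in doc.locations):
            return False
    if q.date_from is not None or q.date_to is not None:
        if doc.first_date is None:
            return False
        if q.date_from is not None and doc.first_date < q.date_from:
            return False
        if q.date_to is not None and doc.first_date > q.date_to:
            return False
    return True


def search(docs: List[IndexedDoc], q: Query) -> List[str]:
    """Linear scan; a real index avoids touching every document."""
    return [d.doc_id for d in docs if matches(d, q)]


docs = [
    IndexedDoc("a", "Flood warning issued for Boston", [(42.36, -71.06)], date(2008, 3, 14)),
    IndexedDoc("b", "Flood defences in Paris", [(48.86, 2.35)], None),
]
print(search(docs, Query(keyword="flood", bbox=(40.0, -75.0, 45.0, -70.0))))
```

Zooming the map corresponds to shrinking `bbox`; a specialized index matters because a scan like this one becomes too slow on massive collections.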