Collaborative Georeferencing
The goal of this project is to provide a mechanism
whereby groups of users can form communities to
collaboratively georeference and verify a shared dataset.
This collaborative georeferencing framework consists of
two end-user components:
- The GEOLocate web-based collaborative client for reviewing and editing community records.
- A web-based data management portal for creating and managing communities, their respective users, and data sources.
Shared community datasets created via the portal may
combine multiple underlying data sources: live DiGIR
providers, uploaded text files, or both. Support
for TAPIR providers is currently under development. Data
are stored using the full Darwin Core 1.2 specification,
but subsets and/or alternative schemas may be imported
using the schema mapping interface. During import, data
items are automatically normalized, georeferenced and
related to one another via a similarity index. This index
is used to identify all records that appear to describe
the same collection locality regardless of syntax. During
coordinate verification, users have the option to
re-classify records that were incorrectly related to one
another.
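The idea of relating records "regardless of syntax" can be illustrated with a minimal sketch. The page does not describe how the similarity index is actually computed, so everything here is an assumption made for illustration: the abbreviation table, the function names `normalize_locality` and `group_by_locality`, and the choice to group on an exact normalized key rather than a fuzzy score.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; the real normalization rules are not
# documented in the text above.
ABBREVIATIONS = {
    "mi": "miles",
    "km": "kilometers",
    "n": "north",
    "s": "south",
    "e": "east",
    "w": "west",
    "hwy": "highway",
    "rd": "road",
}


def normalize_locality(text):
    """Lowercase, drop punctuation, and expand common abbreviations."""
    # Remove apostrophes first so possessives do not leave a stray "s".
    cleaned = re.sub(r"['’]", "", text.lower())
    # Keep decimal numbers whole ("2.5") and split everything else into words.
    tokens = re.findall(r"\d+(?:\.\d+)?|[a-z]+", cleaned)
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def group_by_locality(records):
    """Group (record_id, locality) pairs that share a normalized locality."""
    groups = defaultdict(list)
    for record_id, locality in records:
        groups[normalize_locality(locality)].append(record_id)
    return dict(groups)


records = [
    (1, "2.5 mi. N of Hwy 21"),
    (2, "2.5 miles north of Highway 21"),
    (3, "Pearl River at Bogalusa"),
]
groups = group_by_locality(records)
for key, ids in groups.items():
    print(key, ids)
```

Under this sketch, records 1 and 2 fall into the same group, so a reviewer would only need to verify that locality once. An exact-key grouping like this will also relate some records incorrectly (or miss near matches), which is the kind of case the re-classification option during verification is meant to handle.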
Verification and correction of the computer-generated
geographic coordinates are accomplished using the GEOLocate
desktop application. GEOLocate allows users to log in to
their communities, retrieve and visualize results, make
any necessary corrections, provide additional comments,
define errors as polygons, and save the results back to
the shared dataset. The verified results of georeferencing
can then be downloaded via the portal’s data management
interface for re-import to the parent database.
To examine the gains in efficiency over traditional
georeferencing, 2100 randomly selected collecting events
from the TUMNH fish collection were imported and
georeferenced using the collaborative georeferencing
framework. The TUMNH fish collection was georeferenced by
hand in the mid-to-late 1990s and therefore provides a
useful test bed for assessing the efficiency and accuracy
of automated methodologies. Of the 2100 records, 30% were
identified as being similar to other records and an
additional 33% were duplicates, leaving a total of 782
unique locations requiring correction, a 63% reduction in
effort overall.
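The reduction figure can be checked directly from the two counts reported above. The short sketch below uses only those counts; the function name `effort_reduction` is introduced here for illustration.

```python
def effort_reduction(total, remaining):
    """Fraction of records that no longer need individual correction."""
    return 1 - remaining / total


total_records = 2100      # collecting events imported
unique_locations = 782    # locations still requiring correction

reduction = effort_reduction(total_records, unique_locations)
print(f"{reduction:.1%}")  # 62.8%, which rounds to the reported 63%
```

The result also agrees with the sum of the two reported shares, 30% similar plus 33% duplicate records.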
Video Tutorials
- Using the Collaborative Georeferencing Web Client: