To initialize the spatial data, you must have PostGIS installed. The library was tested on Postgres 9.3.4 with PostGIS 2.12 on Mac OS X.

The easiest way to import the data needed by the spatial components is to click the "spatial" box in the graphical loader tool. However, you can also configure and initialize the spatial components manually.

After installing PostGIS, create a new database, connect to the new database and run the following SQL to enable spatial support:
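On PostgreSQL 9.1 and later with PostGIS 2.x, spatial support is enabled per database with the statement CREATE EXTENSION postgis; and the connecting role typically needs superuser rights to run it. The following is a minimal sketch that issues that statement over JDBC. It is not part of WikiBrain: the class name EnablePostgis and the helper jdbcUrl are hypothetical, the host, port, database, and user are placeholder values that you should replace with your own, and it assumes the PostgreSQL JDBC driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class EnablePostgis {
    // Standard PostgreSQL statement that installs the PostGIS extension
    // into the database the connection points at.
    static final String ENABLE_SQL = "CREATE EXTENSION IF NOT EXISTS postgis";

    // Builds a PostgreSQL JDBC URL from host, port, and database name.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Placeholder connection settings (assumptions); adjust to your setup.
        String url = jdbcUrl("localhost", 5432, "wikibrain");
        String user = "toby";
        String password = "";

        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement()) {
            stmt.execute(ENABLE_SQL);
            System.out.println("PostGIS enabled in " + url);
        } catch (SQLException e) {
            // Typical causes: server not running, missing JDBC driver,
            // wrong credentials, or insufficient privileges.
            System.err.println("Could not enable PostGIS: " + e.getMessage());
        }
    }
}
```

Running the same statement from psql or any other SQL client connected to the new database has the same effect.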

Then open your configuration file and set the spatial data source to match your PostGIS installation. For example:

spatial : {
    dao : {
        dataSource : {
            // These all use keys standard to Geotools JDBC
            // see:

            # Change this part according to your DB settings
            default : postgis
            postgis : {
                dbtype : postgis
                host : localhost
                port : 5432
                schema : public
                database : wikibrain
                user : toby
                passwd : ""
                max connections : 19
            }
        }
    }
}

Now you can load the spatial layer by running the following command. Note that if you haven't loaded the concept or wikidata stages (required by the spatial components), they will also be loaded.

org.wikibrain.Loader -s spatial

Try running CalculateGeographicDistanceBetweenPages. If it runs correctly, the spatial module is successfully initialized.

Integrating new layers with WikiBrain (or updating existing integrated layers)

TODO: translate these notes into something coherent.

  1. Make sure the dataset information is correct in the spatial.datasets section of reference.conf.
  2. Run the ShapeFileMatcher with the names of the reference system, layer, and dataset:
     org.wikibrain.spatial.matcher.ShapeFileMatcher earth country naturalEarth
  3. Go to the dat/spatial/<refSys>/<layerGroup>/ directory and edit the CSV file called <datasetName>.wbmapping.csv. You can take a look at an example for Natural Earth countries in this Google Spreadsheet. The important columns are as follows:
  • WB_STATUS: Starts as "U" (unknown) when automatically matched. Change to "V" (verified) after you are certain the matching is correct.
  • WB_TITLE: Should contain the title of the correct Wikipedia article. Defaults to WikiBrain's best guess.
  • WB_GUESS1 / WB_GUESS2 / WB_GUESS3: Alternate guesses about the title.
  • WB_SCORE: A score indicating WikiBrain's confidence in its match.

You should only change WB_STATUS and WB_TITLE. You can also reorder the spreadsheet rows.
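
While verifying a mapping file, it can help to see how many rows are still marked "U". The following is a minimal sketch, not part of WikiBrain: the class name MappingStatusSummary and its methods are hypothetical. It assumes the file is UTF-8, has a header row containing WB_STATUS, and has no field that spans multiple lines.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MappingStatusSummary {
    // Splits one CSV line, honoring double-quoted fields so that titles
    // containing commas stay in a single field. A doubled quote inside a
    // quoted field is read as a literal quote character.
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    cur.append('"');
                    i++; // skip the second quote of the escaped pair
                } else if (c == '"') {
                    inQuotes = false;
                } else {
                    cur.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());
        return fields;
    }

    // Counts data rows per WB_STATUS value, in order of first appearance.
    static Map<String, Integer> countByStatus(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        if (lines.isEmpty()) {
            return counts;
        }
        // Locate the WB_STATUS column in the header row.
        List<String> header = splitCsvLine(lines.get(0));
        int statusCol = -1;
        for (int i = 0; i < header.size(); i++) {
            if (header.get(i).trim().equals("WB_STATUS")) {
                statusCol = i;
            }
        }
        if (statusCol < 0) {
            throw new IllegalArgumentException("No WB_STATUS column in header");
        }
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            List<String> fields = splitCsvLine(line);
            String status = statusCol < fields.size() ? fields.get(statusCol).trim() : "";
            counts.merge(status, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java MappingStatusSummary <path to .wbmapping.csv>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (Map.Entry<String, Integer> e : countByStatus(lines).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Run it with the path to the mapping file, that is dat/spatial/<refSys>/<layerGroup>/<datasetName>.wbmapping.csv with the placeholders filled in. It only reads the file, so it does not interfere with the rule that WB_STATUS and WB_TITLE are the only columns to edit.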