Welcome to Geo Engine Docs

Geo Engine is a cloud-ready geospatial data processing platform. This documentation presents the foundations of the system and explains how to use it.

The Geo Engine

Geo Engine is a cloud-ready geospatial data processing platform. Here, we give an overview of its architecture and describe the main components.

Architecture

Geo Engine consists of a backend and several frontends. The backend is subdivided into three subcomponents: services, operators, and data types. The data types block specifies primitives such as feature collections for vector data and grids for raster data. It also defines plots and basic operations, e.g., projections. The operators block contains the processing engine and the operators themselves, i.e., source operators as well as raster and vector time series processing. It also provides raster time series stream adapters, which can be used as building blocks for operators. The services block contains protocols, e.g., OGC standard interfaces, as well as Geo Engine-specific interfaces such as workflow registration, plot queries, and data upload. Each subcomponent can have additions in Geo Engine Pro, for instance, user management, which is only available in Geo Engine Pro.

Geo Engine ships two frontends of its own: geoengine-ui, for building web applications on top of Geo Engine, and geoengine-python, a Python library that can be used in Jupyter Notebooks. Third-party applications like QGIS can access Geo Engine via its OGC interfaces.

All components of Geo Engine are fully containerized and Docker-ready. Geo Engine builds upon several technologies, including GDAL, arrow, Angular, and OpenLayers.

Datasets

A dataset is a loadable unit in Geo Engine. It is a parameter of a source operator (e.g., a GdalSource) and identifies the data that is loaded. Geo Engine supports different types of data, reflected by a DataId, which refers either to an internal dataset or to external data.

Internal dataset

An internal dataset is a dataset that is stored in the Geo Engine. Thus, it is efficiently accessible and can be used in workflows. The dataset is identified by a DatasetName and contains a DatasetDefinition that describes the data.

The DatasetName is a string that consists of an optional namespace and a name, separated by a colon. For instance, both namespace:name and name refer to datasets. The name can consist of letters (a-z and A-Z), digits (0-9), dashes (-) and underscores (_).
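The naming rule above can be checked with a small parser. This is a minimal sketch and not Geo Engine's own implementation: the function and class names are hypothetical, and applying the same character set to the namespace as to the name is an assumption.

```python
import re
from typing import NamedTuple, Optional

# Assumption: the namespace uses the same characters as the name.
_PART = r"[A-Za-z0-9_-]+"
_DATASET_NAME = re.compile(rf"(?:(?P<namespace>{_PART}):)?(?P<name>{_PART})")


class DatasetName(NamedTuple):
    namespace: Optional[str]  # None when no namespace is given
    name: str


def parse_dataset_name(text: str) -> DatasetName:
    """Split 'namespace:name' or 'name' into its parts.

    Raises ValueError if the text does not follow the naming rule.
    """
    match = _DATASET_NAME.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid dataset name: {text!r}")
    return DatasetName(match.group("namespace"), match.group("name"))
```

For example, `parse_dataset_name("name")` yields a `DatasetName` without a namespace, while a name containing spaces is rejected.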

External data

An external dataset is a dataset that is not stored in the Geo Engine. Instead, Geo Engine accesses it from a foreign location. The dataset is identified by an ExternalDataId that consists of a DataProviderId and a LayerId. While the DataProviderId is usually a UUID that identifies the data provider within Geo Engine, the LayerId is a string that identifies the layer within the data provider.

The ExternalDataId is a string that consists of a namespace, the DataProviderId and a name, separated by colons. The namespace cannot be omitted and is _ for the global namespace. For instance, _:{uuid}:name or namespace:{uuid}:name refer to datasets. If the name is a complex string, it can be enclosed in backticks, e.g., namespace:{uuid}:`name with spaces`.
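The string format described above can be split into its three parts as follows. This is a hedged sketch: the function name is hypothetical, `{uuid}` is treated as a placeholder for a plain UUID string, and the UUID in the usage note is an arbitrary example value.

```python
import uuid
from typing import NamedTuple


class ExternalDataId(NamedTuple):
    namespace: str          # "_" denotes the global namespace
    provider_id: uuid.UUID  # the DataProviderId
    name: str               # the name, without enclosing backticks


def parse_external_data_id(text: str) -> ExternalDataId:
    """Parse 'namespace:uuid:name'; the name may be enclosed in backticks."""
    # Split on the first two colons only, so a quoted name may contain colons.
    parts = text.split(":", 2)
    if len(parts) != 3 or not parts[0]:
        raise ValueError(f"expected namespace:uuid:name, got {text!r}")
    namespace, provider, name = parts
    if len(name) >= 2 and name.startswith("`") and name.endswith("`"):
        name = name[1:-1]
    if not name:
        raise ValueError("the name must not be empty")
    # uuid.UUID raises ValueError for malformed input.
    return ExternalDataId(namespace, uuid.UUID(provider), name)
```

Calling `parse_external_data_id("_:9c874d9e-cea0-4553-b727-a13cb26ae4bb:`name with spaces`")` returns the global namespace, the provider UUID, and the name with the backticks removed.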

Layers

A layer is a browsable unit in Geo Engine. In general, it is a named Workflow with additional meta information like a description and a default Colorizer. Layers are identified by a LayerId, which is usually a UUID. Every layer can be part of one or more Layer collections.

Layer collections

Layer collections are groups of Layers. The collections themselves can be grouped inside other collections. Every layer collection has a name and a description. Layer collections, just like layers, can be part of one or more other layer collections.
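The nesting of layers and layer collections can be pictured as a tree that is walked from the root. The sketch below uses a hypothetical in-memory model with made-up collection and layer names; it does not call the Geo Engine API and assumes that collections do not contain each other cyclically.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical model: every collection lists its child collections and layers.
COLLECTIONS: Dict[str, Dict[str, List[str]]] = {
    "root": {"collections": ["land", "climate"], "layers": []},
    "land": {"collections": [], "layers": ["ndvi", "elevation"]},
    # "ndvi" appears twice: a layer can be part of several collections.
    "climate": {"collections": [], "layers": ["temperature", "ndvi"]},
}


def layer_paths(collection: str, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield the path from the starting collection to every layer below it."""
    path = prefix + (collection,)
    node = COLLECTIONS[collection]
    for layer in node["layers"]:
        yield path + (layer,)
    for child in node["collections"]:
        yield from layer_paths(child, path)


paths = list(layer_paths("root"))
```

Because a layer may belong to more than one collection, the same layer can be reached via several paths.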

Browsing

Inside Geo Engine's web interface, you can browse the available layers and layer collections when adding data.

In Python, you can call the ge.layer_collection() function to list the root collection, which contains the paths to all underlying layers.

Pro Features

While much of Geo Engine's functionality is Open Source and freely usable, some parts are only available in the Pro version. To use the Pro version, you need to purchase a Pro license. You may, however, be eligible for a free academic license. Please contact us at info@geoengine.de to request one.

Users and Permissions

The Pro version of Geo Engine includes a user management system. Users can either be anonymous or registered. On the first startup, an admin user will be created.

Geo Engine has a Role-Based Access Control (RBAC) system. Users can have different roles, and permissions on resources are granted to these roles. By default, every user has a unique role of their own and either the role anonymous or the role registered. The admin user has the role admin.

Geo Engine allows defining permissions for resources like Datasets, Workflows, Layers and Projects. When a resource is created, the creator gets the Owner permission. This means they can do everything with the resource, including deleting it and permitting others to use it. For read-only access, the Read permission is available. Permissions are managed via the Permissions API. Admin users, i.e., users with the role admin assigned to them, can create new roles and assign them to users. Roles are also managed via the Permissions API. Please refer to the API documentation (TODO: link) for more information. Alternatively, you can use our Python library to manage permissions. Please refer to the Python library documentation for more information.

Example

Let's say Alice creates a project P. She automatically gets the Owner permission on the project assigned to her user role. Then, she adds a Read permission for user Bob. Before the permission is added, the system checks for the Owner permission on project P. As Alice is the owner, this operation succeeds. When Bob tries to access project P, the system checks for the Read permission, which again succeeds.

Alice now wants to grant Charly and Dave the Read permission as well. Both Charly and Dave have the role Friends of Alice. She decides to give the permission to the role instead of to both users individually. Both Charly and Dave can now access project P, but Mallory, who does not have the role, gets a PermissionDenied error. When Erin later gets the role Friends of Alice assigned, she automatically gains access to project P as well.

The complete permission scenario looks like this:

Resources
- project P
Users
- Alice
- Bob
- Charly
- Dave
- Erin
- Mallory
Permissions (Role, Resource, Permission)
- Alice, project P, Owner
- Bob, project P, Read
- Friends of Alice, project P, Read
Roles
- User roles (omitted)
- Friends of Alice
  - Charly
  - Dave
Read access allowed
- Alice
- Bob
- Charly
- Dave
- Erin
Read access denied
- Mallory
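The scenario above can be replayed with a small in-memory model of role-based access control. This is an illustrative sketch only and not Geo Engine's Permissions API: the class and method names are hypothetical, and naming each user's unique role after the user is an assumption.

```python
from typing import Dict, Set, Tuple


class PermissionDenied(Exception):
    """Raised when a user lacks the required permission on a resource."""


class PermissionStore:
    def __init__(self) -> None:
        self.roles_of: Dict[str, Set[str]] = {}        # user -> roles
        self.grants: Set[Tuple[str, str, str]] = set()  # (role, resource, permission)

    def add_user(self, user: str) -> None:
        # Assumption: the unique per-user role is named after the user.
        self.roles_of[user] = {user, "registered"}

    def assign_role(self, user: str, role: str) -> None:
        self.roles_of[user].add(role)

    def has_permission(self, user: str, resource: str, permission: str) -> bool:
        # Owners can do everything, so Owner also satisfies a Read check.
        accepted = {permission, "Owner"}
        return any(
            (role, resource, p) in self.grants
            for role in self.roles_of[user]
            for p in accepted
        )

    def check(self, user: str, resource: str, permission: str) -> None:
        if not self.has_permission(user, resource, permission):
            raise PermissionDenied(f"{user} lacks {permission} on {resource}")

    def create_resource(self, creator: str, resource: str) -> None:
        # The creator's user role receives the Owner permission.
        self.grants.add((creator, resource, "Owner"))

    def grant(self, granter: str, role: str, resource: str, permission: str) -> None:
        # Only an owner of the resource may grant permissions on it.
        self.check(granter, resource, "Owner")
        self.grants.add((role, resource, permission))


store = PermissionStore()
for user in ["Alice", "Bob", "Charly", "Dave", "Erin", "Mallory"]:
    store.add_user(user)
store.assign_role("Charly", "Friends of Alice")
store.assign_role("Dave", "Friends of Alice")

store.create_resource("Alice", "project P")
store.grant("Alice", "Bob", "project P", "Read")               # granted to Bob's user role
store.grant("Alice", "Friends of Alice", "project P", "Read")  # granted to a shared role
store.assign_role("Erin", "Friends of Alice")                  # Erin gains access via the role
```

Granting to the role rather than to individual users means that later role assignments, such as Erin's, need no further permission changes.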

API

This chapter introduces the API of Geo Engine.

Workflows

This section introduces the workflow API of Geo Engine.

ResultDescriptor

Call /workflow/{workflowId}/metadata to get the result descriptor of the workflow. It describes the result of the workflow by data type, spatial reference, temporal and spatial extent and some more information that is specific to raster and vector results.

Example response for rasters

{
  "type": "raster",
  "dataType": "U8",
  "spatialReference": "EPSG:4326",
  "measurement": {
    "type": "unitless"
  },
  "time": {
    "start": "2014-01-01T00:00:00.000Z",
    "end": "2014-07-01T00:00:00.000Z"
  },
  "bbox": {
    "upperLeftCoordinate": [-180.0, 90.0],
    "lowerRightCoordinate": [180.0, -90.0]
  }
}

Example response for vectors

{
  "type": "vector",
  "dataType": "MultiPoint",
  "spatialReference": "EPSG:4326",
  "columns": {
    "id": "int",
    "name": "text",
    "value": "float"
  },
  "time": {
    "start": "2014-04-01T00:00:00.000Z",
    "end": "2014-07-01T00:00:00.000Z"
  },
  "bbox": {
    "lowerLeftCoordinate": [3.9662060000000001, 45.9030360000000002],
    "upperRightCoordinate": [19.171284, 51.8473430000000022]
  }
}
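Note that the two example responses describe the bounding box differently: the raster response uses upperLeftCoordinate and lowerRightCoordinate, while the vector response uses lowerLeftCoordinate and upperRightCoordinate. The sketch below normalizes both layouts to one tuple. The helper name is hypothetical, and the code assumes that each coordinate is given as [x, y], as in the examples.

```python
import json
from typing import Tuple


def bounds(descriptor: dict) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) for either bbox layout."""
    bbox = descriptor["bbox"]
    if "upperLeftCoordinate" in bbox:
        # Raster layout: upper left and lower right corner.
        min_x, max_y = bbox["upperLeftCoordinate"]
        max_x, min_y = bbox["lowerRightCoordinate"]
    else:
        # Vector layout: lower left and upper right corner.
        min_x, min_y = bbox["lowerLeftCoordinate"]
        max_x, max_y = bbox["upperRightCoordinate"]
    return (min_x, min_y, max_x, max_y)


# Excerpts of the two example responses shown above.
raster = json.loads("""
{
  "type": "raster",
  "bbox": {
    "upperLeftCoordinate": [-180.0, 90.0],
    "lowerRightCoordinate": [180.0, -90.0]
  }
}
""")
vector = json.loads("""
{
  "type": "vector",
  "bbox": {
    "lowerLeftCoordinate": [3.966206, 45.903036],
    "upperRightCoordinate": [19.171284, 51.847343]
  }
}
""")

raster_bounds = bounds(raster)
vector_bounds = bounds(vector)
```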

Datatypes

This chapter introduces the datatypes of Geo Engine.

Colorizer

A colorizer specifies a mapping between values and pixels/objects of an output image. Different variants of colorizers perform different kinds of mapping. In general, there are two families of colorizers: gradient and palette. Gradients interpolate a continuous spectrum of colors between explicitly stated tuples (breakpoints) of a value and a color. A palette colorizer, on the other hand, generates a discrete set of colors, each mapped to a specific value.

There are three miscellaneous fields in both of the gradient colorizers, namely noDataColor, overColor and underColor. The field noDataColor is used for all missing, NaN or no data values. The fields overColor and underColor are used for all overflowing values. For instance, if there are breakpoints defined from 0 to 10, but a value of -5 or 11 is mapped to a color, the respective field will be chosen instead. This way, you can specifically highlight values that lie outside of a given range.

For a palette colorizer, there are no overColor and underColor fields. If a given value does not match any entry in the palette's definition, it is mapped to the defaultColor. The noDataColor works in the same manner as in the gradient variants.

Colors are defined as RGBA arrays, where the first three values refer to red, green and blue and the fourth one to alpha, i.e., opacity. All values range from 0 to 255. For instance, [255, 255, 255, 255] is opaque white and [0, 0, 0, 127] is semi-transparent black.
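The palette behaviour and the RGBA format described above can be sketched as a simple lookup. This is an illustration and not the colorizer implementation: the function name, the `colors` mapping and the example palette are hypothetical, while noDataColor and defaultColor correspond to the fields named in the text.

```python
import math
from typing import Dict, List, Optional

Rgba = List[int]  # [red, green, blue, alpha], each from 0 to 255


def palette_color(
    value: Optional[float],
    colors: Dict[float, Rgba],
    no_data_color: Rgba,
    default_color: Rgba,
) -> Rgba:
    """Map a value to its palette color."""
    # Missing and NaN values are drawn in the no-data color.
    if value is None or math.isnan(value):
        return no_data_color
    # Values without a palette entry fall back to the default color.
    return colors.get(value, default_color)


# Hypothetical palette for a classified raster with two classes.
palette = {
    1.0: [255, 0, 0, 255],  # opaque red
    2.0: [0, 255, 0, 255],  # opaque green
}
transparent = [0, 0, 0, 0]
semi_transparent_black = [0, 0, 0, 127]

known = palette_color(2.0, palette, transparent, semi_transparent_black)
unknown = palette_color(7.0, palette, transparent, semi_transparent_black)
missing = palette_color(float("nan"), palette, transparent, semi_transparent_black)
```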

Linear Gradient

A linear gradient linearly interpolates colors between the breakpoints of a color table. For instance, consider a gradient that represents the physical state of water at different temperatures. The gradient is defined between 0.0 and 99.99, where 0.0 is shown as light blue and 99.99 as blue. Any value below 0.0, i.e., ice, is shown as white. Values above 99.99 are shown as light gray.
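The lookup for the water example can be sketched as follows. The concrete RGBA values chosen for light blue, blue, white and light gray are assumptions, as are the function name, the transparent no-data color, and the choice to interpolate each RGBA channel independently and round to the nearest integer.

```python
import math
from bisect import bisect_right
from typing import List, Optional, Tuple

Rgba = List[int]


def linear_gradient_color(
    value: Optional[float],
    breakpoints: List[Tuple[float, Rgba]],  # sorted ascending by value
    no_data_color: Rgba,
    over_color: Rgba,
    under_color: Rgba,
) -> Rgba:
    """Interpolate a color linearly between the two enclosing breakpoints."""
    if value is None or math.isnan(value):
        return no_data_color
    if value < breakpoints[0][0]:
        return under_color
    if value > breakpoints[-1][0]:
        return over_color
    values = [v for v, _ in breakpoints]
    i = bisect_right(values, value)
    if i == len(values):
        # The value equals the last breakpoint.
        return breakpoints[-1][1]
    (v0, c0), (v1, c1) = breakpoints[i - 1], breakpoints[i]
    t = (value - v0) / (v1 - v0)
    return [round(a + t * (b - a)) for a, b in zip(c0, c1)]


# Assumed RGBA values for the colors named in the text.
LIGHT_BLUE = [204, 229, 255, 255]
BLUE = [0, 0, 255, 255]
WHITE = [255, 255, 255, 255]
LIGHT_GRAY = [211, 211, 211, 255]
TRANSPARENT = [0, 0, 0, 0]

water = [(0.0, LIGHT_BLUE), (99.99, BLUE)]

ice = linear_gradient_color(-5.0, water, TRANSPARENT, LIGHT_GRAY, WHITE)
steam = linear_gradient_color(120.0, water, TRANSPARENT, LIGHT_GRAY, WHITE)
```

Here, `ice` resolves to the underColor and `steam` to the overColor, matching the behaviour described for values outside the breakpoint range.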

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| `granularity` | `TimeGranularity` | granularity of the time steps | `months` |
| `step` | `integer` | number of time steps | 1 |

| Variant | Description |
| --- | --- |
| `millis` | milliseconds |
| `seconds` | seconds |
| `minutes` | minutes |
| `hours` | hours |
| `days` | days |
| `months` | months |
| `years` | years |
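The two tables above describe a time step as a granularity plus a number of steps. The following sketch shows how such a step could be applied to a timestamp. It is not Geo Engine's implementation: the function name is hypothetical, and clamping the day of month for months and years (e.g., January 31 plus one month gives the last day of February) is an assumption.

```python
import calendar
from datetime import datetime, timedelta

# Granularities with a fixed length can be expressed as a timedelta.
_FIXED = {
    "millis": timedelta(milliseconds=1),
    "seconds": timedelta(seconds=1),
    "minutes": timedelta(minutes=1),
    "hours": timedelta(hours=1),
    "days": timedelta(days=1),
}


def advance(start: datetime, granularity: str, step: int) -> datetime:
    """Move `start` forward by `step` units of `granularity`."""
    if granularity in _FIXED:
        return start + step * _FIXED[granularity]
    if granularity == "years":
        months = 12 * step
    elif granularity == "months":
        months = step
    else:
        raise ValueError(f"unknown granularity: {granularity!r}")
    # Months and years vary in length, so use calendar arithmetic.
    index = start.year * 12 + (start.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    # Assumption: clamp to the last valid day of the target month.
    day = min(start.day, calendar.monthrange(year, month)[1])
    return start.replace(year=year, month=month, day=day)


next_month = advance(datetime(2014, 1, 1), "months", 1)
```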

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| `expression` | `Expression` | Expression script | `ln(1/x)` |
| `outputType` | `RasterDataType` | A raster data type for the output | `U8` |
| `mapNoData` | `bool` | Should NO DATA values be mapped with the `expression`? Otherwise, they are mapped automatically to NO DATA. | `false` |

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| `column` | string | a column name of the `FeatureCollection` | "precipitation" |
| `ranges` | List of either string or number ranges | one or more ranges of either strings or numbers; each range works as an or for the filter | [[42,43]] |
| `keepNulls` | boolean | should null values be kept or discarded? | true |
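The filter semantics described by these parameters can be sketched in a few lines. This is an illustration over plain Python dictionaries and not the operator itself: the function name and the sample rows are hypothetical, and treating both range bounds as inclusive is an assumption.

```python
from typing import List, Optional, Sequence, Union

Bound = Union[int, float, str]


def keep_feature(
    value: Optional[Bound],
    ranges: Sequence[Sequence[Bound]],
    keep_nulls: bool,
) -> bool:
    """Decide whether a feature passes the filter on one column."""
    if value is None:
        return keep_nulls
    # The ranges are combined with a logical or; bounds are assumed inclusive.
    return any(low <= value <= high for low, high in ranges)


rows: List[dict] = [
    {"id": 1, "precipitation": 42},
    {"id": 2, "precipitation": 50},
    {"id": 3, "precipitation": None},
]

kept = [row["id"] for row in rows if keep_feature(row["precipitation"], [[42, 43]], True)]
```

With the example values from the table, the row with precipitation 42 and the row with a null value are kept, while the row with precipitation 50 is discarded.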