Welcome to Geo Engine Docs

Geo Engine is a cloud-ready geospatial data processing platform. This documentation presents the foundations of the system and how to use it.

The Geo Engine

Geo Engine is a cloud-ready geospatial data processing platform. Here, we give an overview of its architecture and describe the main components.

Architecture

Geo Engine consists of a backend and several frontends. The backend is subdivided into three subcomponents: services, operators, and data types. The Data Types block specifies primitives such as feature collections for vector data and gridded rasters for raster data. Moreover, it defines plots and basic operations, e.g., projections. The Operators block contains the processing engine and the operators, i.e., source operators as well as raster and vector time series processing operators. Furthermore, there are raster time series stream adapters, which can be used as building blocks for operators. The Services block contains protocols, e.g., OGC standard interfaces, as well as Geo Engine specific interfaces, such as workflow registration, plot queries, and data upload. Each subcomponent can have additions in Geo Engine Pro; user management, for instance, is only available in Geo Engine Pro.

Geo Engine has several frontends. geoengine-ui is used for building web applications on top of Geo Engine. geoengine-python is a Python library that can be used in Jupyter Notebooks. Third-party applications like QGIS can access Geo Engine via its OGC interfaces.

All components of Geo Engine are fully containerized and Docker-ready. Geo Engine builds upon several technologies, including GDAL, arrow, Angular, and OpenLayers.

API

This chapter introduces the API of Geo Engine.

Workflows

This section introduces the workflow API of Geo Engine.

ResultDescriptor

Call /workflow/{workflowId}/metadata to get the result descriptor of the workflow. It describes the result of the workflow by data type, spatial reference, temporal and spatial extent and some more information that is specific to raster and vector results.

Example response for rasters

{
  "type": "raster",
  "dataType": "U8",
  "spatialReference": "EPSG:4326",
  "measurement": {
    "type": "unitless"
  },
  "time": {
    "start": "2014-01-01T00:00:00.000Z",
    "end": "2014-07-01T00:00:00.000Z"
  },
  "bbox": {
    "upperLeftCoordinate": [-180.0, 90.0],
    "lowerRightCoordinate": [180.0, -90.0]
  }
}

Example response for vectors

{
  "type": "vector",
  "dataType": "MultiPoint",
  "spatialReference": "EPSG:4326",
  "columns": {
    "id": "int",
    "name": "text",
    "value": "float"
  },
  "time": {
    "start": "2014-04-01T00:00:00.000Z",
    "end": "2014-07-01T00:00:00.000Z"
  },
  "bbox": {
    "lowerLeftCoordinate": [3.9662060000000001, 45.9030360000000002],
    "upperRightCoordinate": [19.171284, 51.8473430000000022]
  }
}

Datatypes

This chapter introduces the datatypes of Geo Engine.

Colorizer

A colorizer specifies a mapping between values and pixels/objects of an output image. Different variants of colorizers perform different kinds of mapping.

Usually, there are two miscellaneous fields in each colorizer, namely noDataColor and defaultColor. The field noDataColor is used for all missing, NaN or no data values. The defaultColor is used for all overflowing values, for instance, if there are breakpoints defined from 0 to 10, but a value of -5 or 11 is mapped to a color.

Colors are defined as RGBA arrays, where the first three values refer to red, green and blue and the fourth one to alpha, which means transparency. The values range from 0 to 255. For instance, [255, 255, 255, 255] is opaque white and [0, 0, 0, 127] is semi-transparent black.

Linear Gradient

A linear gradient linearly interpolates values within breakpoints of a color table.

Example JSON

{
  "type": "linearGradient",
  "breakpoints": [
    {
      "value": 1.0,
      "color": [255, 255, 255, 255]
    },
    {
      "value": 2.0,
      "color": [0, 0, 0, 255]
    }
  ],
  "noDataColor": [0, 0, 0, 0],
  "defaultColor": [0, 0, 0, 0]
}
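
The interpolation between breakpoints can be illustrated with a small Python sketch. It is not Geo Engine's implementation: the function name, the rounding of channels, and the requirement that breakpoints are sorted by value are assumptions made for this example. It uses the breakpoints from the JSON above and applies noDataColor and defaultColor as described at the beginning of this section.

```python
import math
from typing import Optional, Sequence, Tuple

Color = Tuple[int, int, int, int]  # RGBA, each channel in 0..255


def linear_gradient_color(
    value: Optional[float],
    breakpoints: Sequence[Tuple[float, Color]],
    no_data_color: Color,
    default_color: Color,
) -> Color:
    """Map a value to a color by linear interpolation between breakpoints.

    Assumption: `breakpoints` is non-empty and sorted by ascending value.
    """
    # Missing or NaN values get the no-data color.
    if value is None or math.isnan(value):
        return no_data_color
    # Values outside the breakpoint range get the default color.
    if value < breakpoints[0][0] or value > breakpoints[-1][0]:
        return default_color
    for (v0, c0), (v1, c1) in zip(breakpoints, breakpoints[1:]):
        if v0 <= value <= v1:
            t = 0.0 if v1 == v0 else (value - v0) / (v1 - v0)
            r, g, b, a = (round(x + t * (y - x)) for x, y in zip(c0, c1))
            return (r, g, b, a)
    # Only reached for a single breakpoint that equals the value.
    return breakpoints[0][1]


BREAKPOINTS = [(1.0, (255, 255, 255, 255)), (2.0, (0, 0, 0, 255))]
TRANSPARENT = (0, 0, 0, 0)

# A value halfway between the breakpoints yields a mid-gray, opaque color.
print(linear_gradient_color(1.5, BREAKPOINTS, TRANSPARENT, TRANSPARENT))
```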

Logarithmic Gradient

A logarithmic gradient logarithmically interpolates values within breakpoints of a color table and allows only positive values.

Errors

Services report an error if a logarithmic gradient specification contains a breakpoint with value <= 0.

Example JSON

{
  "type": "logarithmicGradient",
  "breakpoints": [
    {
      "value": 1.0,
      "color": [255, 255, 255, 255]
    },
    {
      "value": 2.0,
      "color": [0, 0, 0, 255]
    }
  ],
  "noDataColor": [0, 0, 0, 0],
  "defaultColor": [0, 0, 0, 0]
}
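
One common way to interpolate logarithmically is to compute the relative position of a value between two breakpoints on a log scale. The following Python sketch shows this, together with the positivity check. Both the formula and the function name are assumptions for illustration and may differ from what Geo Engine does internally.

```python
import math


def log_gradient_fraction(value: float, lower: float, upper: float) -> float:
    """Relative position (0..1) of `value` between two breakpoints on a log scale.

    Assumption: `lower` < `upper`. All values must be positive because the
    logarithm is undefined for values <= 0.
    """
    for v in (value, lower, upper):
        if v <= 0:
            raise ValueError("a logarithmic gradient requires values > 0")
    return (math.log(value) - math.log(lower)) / (math.log(upper) - math.log(lower))


# 10 lies halfway between 1 and 100 on a logarithmic scale.
print(log_gradient_fraction(10.0, 1.0, 100.0))
```

The resulting fraction can then be used to blend the two breakpoint colors channel by channel, as in a linear gradient.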

Palette

A palette maps values as classes to a certain color. Unmapped values result in the NO DATA color.

Example JSON

{
  "type": "palette",
  "colors": {
    "1": [255, 255, 255, 255],
    "2": [0, 0, 0, 255]
  },
  "noDataColor": [0, 0, 0, 0],
  "defaultColor": [0, 0, 0, 0]
}

Measurement

Measurements describe stored data, i.e. what is measured and in which unit.

Unitless

Some values do not have an associated measurement or no information is present.

Example JSON

{
  "type": "unitless"
}

Continuous

The type continuous specifies a continuous variable that is measured in a certain unit.

Example JSON

{
  "type": "continuous",
  "measurement": "Reflectance",
  "unit": "%"
}

Classification

A classification maps numbers to named classes.

Example JSON

{
  "type": "classification",
  "measurement": "Land Cover",
  "classes": {
    "0": "Grassland",
    "1": "Forest",
    "2": "Water"
  }
}

QueryRectangle

A query rectangle defines a multi-dimensional spatial query in Geo Engine. It consists of three parts:

  • two-dimensional spatial bounds (an extent plus its spatial reference system),
  • a time interval,
  • a spatial resolution.

The spatial bounds behave differently for raster, vector, or plot queries. For raster queries, the spatial bounds define a spatial partition. This means the lower right corner of the spatial bounds is not included in the query. For vector queries, the spatial bounds define a bounding box, i.e., a rectangle where all bounds are included. Plot queries behave like vector queries.

Example JSON

{
  "spatial_bounds": {
    "upper_left_coordinate": {
      "x": 10.0,
      "y": 80.0
    },
    "lower_right_coordinate": {
      "x": 70.0,
      "y": 20.0
    }
  },
  "time_interval": {
    "start": "2010-01-01T00:00:00Z",
    "end": "2011-01-01T00:00:00Z"
  },
  "spatial_resolution": {
    "x": 1.0,
    "y": 1.0
  }
}
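
The difference between the two kinds of spatial bounds can be sketched in Python. This is an interpretation of the description above, not Geo Engine code: the function names are hypothetical, and the sketch assumes that a spatial partition excludes its right and bottom edges, while a bounding box includes all edges.

```python
from typing import Tuple

Coordinate = Tuple[float, float]  # (x, y)


def raster_partition_contains(
    upper_left: Coordinate, lower_right: Coordinate, point: Coordinate
) -> bool:
    """Spatial partition: upper-left corner included, lower-right excluded (assumption)."""
    x, y = point
    return upper_left[0] <= x < lower_right[0] and lower_right[1] < y <= upper_left[1]


def vector_bbox_contains(
    lower_left: Coordinate, upper_right: Coordinate, point: Coordinate
) -> bool:
    """Bounding box: all bounds are included."""
    x, y = point
    return lower_left[0] <= x <= upper_right[0] and lower_left[1] <= y <= upper_right[1]


# The lower-right corner is outside the partition but inside the bounding box.
print(raster_partition_contains((10.0, 80.0), (70.0, 20.0), (70.0, 20.0)))
print(vector_bbox_contains((10.0, 20.0), (70.0, 80.0), (70.0, 20.0)))
```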

Raster Data Type

Rasters can have the following data types:

  • U8: unsigned 8-bit integer
  • I8: signed 8-bit integer
  • U16: unsigned 16-bit integer
  • I16: signed 16-bit integer
  • U32: unsigned 32-bit integer
  • I32: signed 32-bit integer
  • U64: unsigned 64-bit integer
  • I64: signed 64-bit integer
  • F32: 32-bit floating-point
  • F64: 64-bit floating-point

Example JSON

"U8"

Time Instance

A time instance is a single point in time. It is specified in UTC (time zone offset 0) and has a maximum resolution of milliseconds.

Example JSON

Specifying in ISO 8601:

"2010-01-01T00:00:00Z"

Using the same date as a UNIX timestamp in milliseconds:

1262304000000
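
The two representations can be converted into each other with the Python standard library. The helper names below are hypothetical, and the sketch only handles the second-precision ISO 8601 format with a trailing Z used in the examples here.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def iso_to_unix_millis(text: str) -> int:
    """Convert an ISO 8601 UTC string to a UNIX timestamp in milliseconds."""
    parsed = datetime.strptime(text, ISO_FORMAT).replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


def unix_millis_to_iso(millis: int) -> str:
    """Convert a UNIX timestamp in milliseconds to an ISO 8601 UTC string."""
    return (EPOCH + timedelta(milliseconds=millis)).strftime(ISO_FORMAT)


print(iso_to_unix_millis("2010-01-01T00:00:00Z"))
print(unix_millis_to_iso(1262304000000))
```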

Time Interval

A time interval consists of two TimeInstances. Please be aware that the interval is defined with closed-open semantics. This means that the start time is inclusive and the end time is exclusive. In mathematical notation, the interval is defined as [start, end).

Example JSON

Specifying in ISO 8601:

{
  "start": "2010-01-01T00:00:00Z",
  "end": "2011-01-01T00:00:00Z"
}

Using the same date as UNIX timestamps in milliseconds:

{
  "start": 1262304000000,
  "end": 1293840000000
}
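
The closed-open semantics can be made concrete with a minimal Python check. The function name is hypothetical, and the timestamps are the millisecond values from the example above.

```python
def interval_contains(start: int, end: int, instant: int) -> bool:
    """Return True if `instant` lies in the closed-open interval [start, end)."""
    return start <= instant < end


START = 1262304000000  # 2010-01-01T00:00:00Z
END = 1293840000000    # 2011-01-01T00:00:00Z

print(interval_contains(START, END, START))  # the start is part of the interval
print(interval_contains(START, END, END))    # the end is not
```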

Time Step

A time step consists of a granularity and a number of steps. For instance, you can specify yearly steps by setting the granularity to years and the number of steps to 1. Half-yearly steps can be specified by setting the granularity to months and the number of steps to 6.

Parameter | Type | Description | Example Value
granularity | TimeGranularity | granularity of the time steps | months
step | integer | number of time steps | 1

TimeGranularity

The granularity of the time steps can take one of the following values.

Variant | Description
millis | milliseconds
seconds | seconds
minutes | minutes
hours | hours
days | days
months | months
years | years

Example JSON

{
  "granularity": "months",
  "step": 1
}

Operators

This chapter introduces the operators of Geo Engine.

ColumnRangeFilter

The ColumnRangeFilter operator allows filtering FeatureCollections. Users can define one or more data ranges for a column in the data table that is then filtered. The filter can be used for numerical as well as textual columns. Each range is inclusive, i.e., [start, end] includes both the start and the end.

For instance, you can filter a collection to only include column values that are either in the range 0-10 or 20-30. Moreover, you can specify the range a to k to dismiss all column values that start with larger letters in the alphabet.

Parameters

Parameter | Type | Description | Example Value
column | string | a column name of the FeatureCollection | "precipitation"
ranges | List of either string or number ranges | one or more ranges of either strings or numbers; the ranges are combined with a logical OR | [[42,43]]
keepNulls | boolean | should null values be kept or discarded? | true

Inputs

The ColumnRangeFilter operator expects exactly one vector input.

Parameter | Type
vector | SingleVectorSource

Errors

If the value in the column parameter is not a column of the feature collection, an error is thrown.

Example JSON

{
  "type": "ColumnRangeFilter",
  "params": {
    "column": "population",
    "ranges": [[1000, 10000]],
    "keepNulls": false
  },
  "sources": {
    "vector": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "e977b123-ca47-4c5b-aace-481119826aaf"
        },
        "attributeProjection": ["name", "population"]
      }
    }
  }
}

{
  "type": "ColumnRangeFilter",
  "params": {
    "column": "name",
    "ranges": [
      ["a", "k"],
      ["v", "z"]
    ],
    "keepNulls": false
  },
  "sources": {
    "vector": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "e977b123-ca47-4c5b-aace-481119826aaf"
        },
        "attributeProjection": ["name", "population"]
      }
    }
  }
}
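
The filter semantics described above (inclusive ranges, combined with OR, plus the keepNulls switch) can be sketched in Python. The function is hypothetical and compares strings with Python's default lexicographic ordering, which is an assumption about how textual ranges are evaluated.

```python
from typing import Optional, Sequence, Tuple, Union

Value = Union[int, float, str]


def passes_filter(
    value: Optional[Value],
    ranges: Sequence[Tuple[Value, Value]],
    keep_nulls: bool,
) -> bool:
    """Return True if a column value is kept by the range filter."""
    if value is None:
        return keep_nulls
    # Each range is inclusive on both ends; any matching range is sufficient.
    return any(start <= value <= end for start, end in ranges)


print(passes_filter(1000, [(1000, 10000)], keep_nulls=False))
print(passes_filter("munich", [("a", "k"), ("v", "z")], keep_nulls=False))
```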

Raster Expression

The Expression operator performs a pixel-wise mathematical expression on one or more raster sources. The expression is specified as a user-defined script in a very simple language. The output is a raster time series with the result of the expression and with time intervals that are the same as for the inputs. Users can specify an output data type. Internally, the expression is evaluated using floating-point numbers.

An example usage scenario is to calculate the NDVI from a near-infrared and a red raster channel. The expression uses two raster sources, referred to as A (near-infrared) and B (red), and calculates the formula (A - B) / (A + B). If the inputs have a monthly temporal resolution, the output NDVI is also a monthly time series.

Parameters

Parameter | Type | Description | Example Value
expression | Expression | Expression script | (A - B) / (A + B)
outputType | RasterDataType | A raster data type for the output | U8
outputMeasurement | Measurement | Description about the output | {"type": "continuous", "measurement": "NDVI"}
mapNoData | bool | Should NO DATA values be mapped with the expression? Otherwise, they are mapped automatically to NO DATA. | false

Types

The following describes the types used in the parameters.

Expression

Expressions are simple scripts to perform pixel-wise computations. One can refer to the raster inputs as A for the first raster, B for the second, and so on. Furthermore, expressions can check with A IS NODATA, B IS NODATA, etc. for NO DATA values. This is important if mapNoData is set to true. Otherwise, NO DATA values are mapped automatically to the output NO DATA value. Finally, the value NODATA can be used to output NO DATA.

Users can think of this implicit function signature for, e.g., two inputs:

fn (A: f64, B: f64) -> f64

As a start, expressions contain algebraic operations and mathematical functions.

(A + B) / 2

In addition, branches can be used to check for conditions.

if A IS NODATA {
    B
} else {
    A
}

Function calls can be used to access utility functions.

max(A, 0)

Currently, the following functions are available:

  • abs(a): absolute value
  • min(a, b), min(a, b, c): minimum value
  • max(a, b), max(a, b, c): maximum value
  • sqrt(a): square root
  • ln(a): natural logarithm
  • log10(a): base 10 logarithm
  • cos(a), sin(a), tan(a), acos(a), asin(a), atan(a): trigonometric functions
  • pi(), e(): mathematical constants
  • round(a), ceil(a), floor(a): rounding functions
  • mod(a, b): division remainder
  • to_degrees(a), to_radians(a): conversion to degrees or radians

To generate more complex expressions, it is possible to have variable assignments.

let mean = (A + B) / 2;
let coefficient = 0.357;
mean * coefficient

Note that all assignments are separated by semicolons. However, the last expression must not be followed by a semicolon.
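
To illustrate how such an expression is applied pixel by pixel, here is a Python sketch of the NDVI formula (A - B) / (A + B) with automatic NO DATA handling, i.e., the behavior when mapNoData is false. It only mimics the semantics described above. Representing NO DATA as None and returning NO DATA for a zero denominator are assumptions made for this example.

```python
from typing import List, Optional

NODATA = None  # assumption: NO DATA is represented as None in this sketch


def ndvi(a: Optional[float], b: Optional[float]) -> Optional[float]:
    """Evaluate (A - B) / (A + B) for one pixel pair."""
    # With mapNoData = false, NO DATA inputs map directly to NO DATA.
    if a is NODATA or b is NODATA:
        return NODATA
    # Assumption: a zero denominator is treated as NO DATA here.
    if a + b == 0:
        return NODATA
    return (a - b) / (a + b)


def apply_pixelwise(
    raster_a: List[List[Optional[float]]],
    raster_b: List[List[Optional[float]]],
) -> List[List[Optional[float]]]:
    """Apply the expression to every pixel of two equally sized rasters."""
    return [
        [ndvi(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(raster_a, raster_b)
    ]


print(apply_pixelwise([[3.0, NODATA]], [[1.0, 2.0]]))
```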

Inputs

The Expression operator expects one to eight raster inputs.

Parameter | Type
A | SingleRasterSource
B | SingleRasterSource
C | SingleRasterSource
... | SingleRasterSource

Errors

The parsing of the expression can fail if there are, e.g., syntax errors.

Example JSON

{
  "type": "Expression",
  "params": {
    "expression": "(A - B) / (A + B)",
    "outputType": "F32",
    "mapNoData": false
  },
  "sources": {
    "A": {
      "type": "GdalSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
        }
      }
    },
    "B": {
      "type": "GdalSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "699b9e14-4bd6-4d57-889a-58f60288b19c"
        }
      }
    }
  }
}

GdalSource

The GdalSource is a source operator that reads raster data using GDAL. The counterpart for vector data is the OgrSource.

Parameters

Parameter | Type | Description | Example Value | Default Value
data | DataId | The id of the data to be loaded | {"type": "internal", "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"} |

Inputs

None

Errors

If the given dataset does not exist or is not readable, an error is thrown.

Example JSON

{
  "type": "GdalSource",
  "params": {
    "data": {
      "type": "internal",
      "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
    }
  }
}

Interpolation

The Interpolation operator artificially increases the resolution of a raster by interpolating the values of the input raster. If the operator is queried with a resolution that is coarser than the input resolution, the interpolation is not applied but the input raster is returned unchanged. Unless a particular input resolution is specified, the resolution of the input raster is used, if it is known.

Parameters

Parameter | Type | Description | Example Value
interpolation | InterpolationMethod | the interpolation method to be used | "nearestNeighbor"
inputResolution | InputResolution | the query resolution for the source operator | {"type": "source"}

Types

The following describes the types used in the parameters.

InterpolationMethod

The operator supports the following interpolation methods:

Value | Description
nearestNeighbor | The value of the nearest neighbor is used.
biLinear | The value is computed by bilinear interpolation.

InputResolution

The operator supports the following input resolutions:

Value | Description
{"type": "source"} | The resolution of the input raster is used.
{"type": "value", "x": 0.1, "y": 0.1} | The resolution is specified explicitly.

Inputs

The Interpolation operator expects exactly one raster input.

Parameter | Type
source | SingleRasterSource

Errors

If the input resolution is set as "source" but the resolution of the input raster is not known, an error will be thrown.

Example JSON

{
  "type": "Raster",
  "operator": {
    "type": "Interpolation",
    "params": {
      "interpolation": "biLinear",
      "inputResolution": {
        "type": "source"
      }
    },
    "sources": {
      "raster": {
        "type": "GdalSource",
        "params": {
          "data": {
            "type": "internal",
            "datasetId": "36574dc3-560a-4b09-9d22-d5945f2b8093"
          }
        }
      }
    }
  }
}
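
For readers unfamiliar with bilinear interpolation, the following Python sketch shows the textbook formula for one output pixel that lies between four input pixels. It is a generic illustration with hypothetical names. How Geo Engine selects the four neighbors and handles tile borders is not covered here.

```python
def bilinear(
    v00: float, v10: float, v01: float, v11: float, fx: float, fy: float
) -> float:
    """Interpolate between four neighboring pixel values.

    v00/v10 are the upper-left/upper-right values, v01/v11 the lower-left/
    lower-right values. fx and fy are the fractional offsets (0..1) of the
    output position from the upper-left pixel.
    """
    top = v00 * (1 - fx) + v10 * fx
    bottom = v01 * (1 - fx) + v11 * fx
    return top * (1 - fy) + bottom * fy


# The center of the four pixels is the mean of their values.
print(bilinear(0.0, 10.0, 20.0, 30.0, 0.5, 0.5))
```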

Neighborhood Aggregate

The NeighborhoodAggregate operator computes an aggregate function for a pixel and its neighborhood. The operator is defined by a neighborhood, given either as a matrix of weights or as a predefined shape, and an aggregate function.

An example usage scenario is to calculate a Gaussian filter to smoothen or blur an image. For each time step in the raster time series, the operator computes the aggregate for each pixel and its neighborhood.

The output data type is the same as the input data type. As the matrix and the aggregate in- and outputs are defined as floating point values, the internal computation is done as floating point calculations.

Parameters

Parameter | Type | Description | Example Value
neighborhood | Neighborhood | Pixel neighborhood specification | {"type": "weightsMatrix", "weights": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]}
aggregateFunction | AggregateFunction | An aggregate function for a set of values | "sum"

Types

The following describes the types used in the parameters.

Neighborhood

There are several types of neighborhoods. Each of them defines a matrix of weights. The number of rows and the number of columns of this matrix must both be odd.

WeightsMatrix

The weights matrix is defined as an \( n \times m \) matrix of floating point values. It is applied to the pixel and its neighborhood to serve as the input for the aggregate function.

For instance, a vertical derivative filter (a component of a Sobel filter) can be defined like this:

{
  "type": "weightsMatrix",
  "weights": [
    [1.0, 0.0, -1.0],
    [2.0, 0.0, -2.0],
    [1.0, 0.0, -1.0]
  ]
}

The aggregate function should be sum in this case.

Rectangle

The rectangle neighborhood is defined by its shape \( n \times m \). The result is a weights matrix with all weights set to 1.0.

{
  "type": "rectangle",
  "dimensions": [3, 3]
}

AggregateFunction

The aggregate function computes a single value from a set of values. The following aggregate functions are supported:

  • sum: The sum of all values
  • standardDeviation: The standard deviation of all values. This ignores NO DATA values.

Inputs

The NeighborhoodAggregate operator expects exactly one raster input.

Parameter | Type
source | SingleRasterSource

Errors

If the number of rows or columns of the neighborhood is not positive and odd, an error will be thrown.

Example JSON

{
  "type": "NeighborhoodAggregate",
  "params": {
    "neighborhood": {
      "type": "weightsMatrix",
      "weights": [
        [1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0]
      ]
    },
    "aggregateFunction": "sum"
  },
  "sources": {
    "raster": {
      "type": "GdalSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "8d01593c-75c0-4ffa-8152-eabfe4430817"
        }
      }
    }
  }
}
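
The following Python sketch illustrates the idea of a weighted neighborhood sum on a small raster. It is not the operator's implementation: the function name is hypothetical, and pixels outside the raster are simply skipped, which is an assumption about border handling made for this example.

```python
from typing import List

Matrix = List[List[float]]


def neighborhood_sum(raster: Matrix, weights: Matrix) -> Matrix:
    """Compute the weighted sum over each pixel's neighborhood."""
    rows, cols = len(weights), len(weights[0])
    if rows % 2 == 0 or cols % 2 == 0:
        raise ValueError("the weights matrix needs an odd number of rows and columns")
    row_offset, col_offset = rows // 2, cols // 2
    height, width = len(raster), len(raster[0])
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            total = 0.0
            for i in range(rows):
                for j in range(cols):
                    yy, xx = y + i - row_offset, x + j - col_offset
                    # Assumption: neighbors outside the raster are ignored.
                    if 0 <= yy < height and 0 <= xx < width:
                        total += weights[i][j] * raster[yy][xx]
            out_row.append(total)
        result.append(out_row)
    return result


ONES = [[1.0, 1.0, 1.0] for _ in range(3)]
# With all weights set to 1.0 (like a 3 x 3 rectangle), the center pixel sums 9 values.
print(neighborhood_sum(ONES, ONES))
```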

OgrSource

The OgrSource is a source operator that reads vector data using OGR. The counterpart for raster data is the GdalSource.

Parameters

Parameter | Type | Description | Example Value | Default Value
data | DataId | The id of the data to be loaded | {"type": "internal", "datasetId": "e977b123-ca47-4c5b-aace-481119826aaf"} |
attributeProjection | Array<String> | (Optional) The list of attributes to load. If nothing is specified, all attributes will be loaded. | ["name", "population"] |
attributeFilters | Array<AttributeFilter> | (Optional) The list of filters to apply on the attributes of features. Only the features that match all of the filters will be loaded. | [{"attribute": "population", "ranges": [[1000, 10000]]}] |

Types

The following describes the types used in the parameters.

AttributeFilter

The AttributeFilter defines one or more ranges on the values of an attribute. The ranges include the lower and upper bounds of the range.

Field | Type | Description
attribute | String | The name of the attribute to filter.
ranges | Array<Array<String or Number>> | The list of ranges to filter.
keepNulls | bool | (Optional) Specifies whether to keep null/no data entries, defaults to false.

Inputs

None

Errors

If the given dataset does not exist or is not readable, an error is thrown.

Example JSON

{
  "type": "OgrSource",
  "params": {
    "data": {
      "type": "internal",
      "datasetId": "e977b123-ca47-4c5b-aace-481119826aaf"
    },
    "attributeProjection": ["name", "population"],
    "attributeFilters": [
      {
        "attribute": "population",
        "ranges": [[1000, 10000]],
        "keepNulls": false
      }
    ]
  }
}

PointInPolygon

The PointInPolygon operator filters point features of a (multi-)point collection with polygons. In more detail, the points of each feature are checked against the polygons of the other collection. If one or more points of a feature are contained in any polygon, the feature is included in the output.

For instance, you can filter tree features inside the polygons of a forest. All features that are not inside any forest polygon are considered either part of another forest or outliers and are thus removed.

Parameters

The operator is parameterless.

Inputs

The PointInPolygon operator expects two vector inputs.

Parameter | Type
points | SingleVectorSource
polygons | SingleVectorSource

Errors

If the points vector input is not a (multi-)point feature collection, an error is thrown.

If the polygons vector input is not a (multi-)polygon feature collection, an error is thrown.

Example JSON

{
  "type": "PointInPolygon",
  "params": {},
  "sources": {
    "points": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "e977b123-ca47-4c5b-aace-481119826aaf"
        },
        "attributeProjection": ["name", "population"]
      }
    },
    "polygons": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "b6191257-6d61-4c6b-90a4-ebfb1b23899d"
        }
      }
    }
  }
}
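
The filter rule, i.e., keeping a feature if at least one of its points lies in any polygon, can be sketched in Python with the classic ray-casting test. The function names are hypothetical, and polygons are simplified to a single outer ring without holes. The algorithm Geo Engine actually uses is not specified here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Ring = Sequence[Point]


def point_in_ring(point: Point, ring: Ring) -> bool:
    """Ray-casting test: count how often a ray to the right crosses the ring."""
    x, y = point
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        # Only edges that span the point's y coordinate can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def keep_feature(points: List[Point], polygons: List[Ring]) -> bool:
    """A (multi-)point feature is kept if any of its points is in any polygon."""
    return any(point_in_ring(p, ring) for p in points for ring in polygons)


FOREST = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(keep_feature([(5.0, 2.0), (2.0, 2.0)], [FOREST]))
```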

RasterTypeConversion

The RasterTypeConversion operator allows changing the data type of raster data. It transforms all pixels into the new data type.

  1. Applying the operator could lead to a loss of precision, e.g., converting an F32 value of 3.1 to U8 returns a value of 3.

  2. If the old value is not valid in the new type, it is clipped to the value range of the new type. E.g., converting an F32 value of 300.0 to U8 returns a value of 255.

Parameters

Parameter | Type | Description | Example Value
outputDataType | RasterDataType | the output type | "U8"

Inputs

The RasterTypeConversion operator expects exactly one raster input.

Parameter | Type
source | SingleRasterSource

Example JSON

{
  "type": "RasterTypeConversion",
  "params": {
    "outputDataType": "U8"
  },
  "sources": {
    "source": {
      "type": "GdalSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "00000000-0000-0000-0000-000000000539"
        }
      }
    }
  }
}
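
The two effects mentioned above, loss of precision and clipping, can be reproduced with a short Python sketch for the U8 case. The function name is hypothetical, and dropping the fractional part is an assumption that is merely consistent with the 3.1 to 3 example.

```python
def convert_to_u8(value: float) -> int:
    """Convert a floating-point pixel value to the U8 range 0..255."""
    # Values outside the target range are clipped to its bounds.
    clipped = min(max(value, 0.0), 255.0)
    # Assumption: the fractional part is dropped.
    return int(clipped)


print(convert_to_u8(3.1))
print(convert_to_u8(300.0))
```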

RasterScaling

The RasterScaling operator scales or unscales the values of a raster by a given slope factor and offset. This allows shrinking and expanding the value range of the pixel values needed to store a raster. It also allows shifting values to all-positive values and back. We use the GDAL terms scale and unscale. Raster data is often scaled to reduce memory and storage consumption. To get the "real" raster values, the unscale operation is applied. Keep in mind that scaling might reduce the precision of the pixel values. (To actually reduce the size of the raster, use the RasterTypeConversion operator and convert to a smaller data type after scaling.)

The operator applies the following formulas to every pixel.

For unscaling the formula is: p_new = p_old * slope + offset.

For scaling the formula is: p_new = (p_old - offset) / slope

p_old and p_new refer to the old and new pixel value. The slope and offset values are either properties attached to the input raster or a fixed value.
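
The two formulas are inverse to each other, which the following Python sketch demonstrates. The function names and the sample slope and offset are chosen for illustration only.

```python
def unscale(p_old: float, slope: float, offset: float) -> float:
    """Unscaling: p_new = p_old * slope + offset."""
    return p_old * slope + offset


def scale(p_old: float, slope: float, offset: float) -> float:
    """Scaling: p_new = (p_old - offset) / slope."""
    return (p_old - offset) / slope


stored = 10.0
real = unscale(stored, slope=0.5, offset=2.0)
# Scaling the "real" value again returns the stored value.
print(real, scale(real, slope=0.5, offset=2.0))
```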

An example for Meteosat Second Generation properties is:

  • offset: msg.calibration_offset
  • slope: msg.calibration_slope

Parameters

Parameter | Type | Description | Example Value
slope | MetadataKeyOrConstant | the key or value to use for slope | {"type": "metadataKey", "domain": "", "key": "scale"}
offset | MetadataKeyOrConstant | the key or value to use for offset | {"type": "constant", "value": 0.1}
scalingMode | scale OR unscale | select scale or unscale mode | "scale"
outputMeasurement* | (optional) Measurement | the measurement of the data produced by the operator | {"type": "continuous", "measurement": "Reflectance", "unit": "%"}

* If no outputMeasurement is given, the measurement of the input raster is used.

Inputs

The RasterScaling operator expects exactly one raster input.

Parameter | Type
source | SingleRasterSource

Types

The following describes the types used in the parameters.

MetadataKeyOrConstant

The MetadataKeyOrConstant type is used to specify a metadata key or a constant value.

Value | Description
{"type": "constant", "value": number} | A constant value.
{"type": "metadataKey", "domain": string, "key": string} | A metadata key to lookup dynamic values from raster (tile) properties.

Example JSON

{
  "type": "RasterScaling",
  "params": {
    "slope": {
      "type": "metadataKey",
      "domain": "",
      "key": "scale"
    },
    "offset": {
      "type": "constant",
      "value": 1.0
    },
    "outputMeasurement": null,
    "scalingMode": "scale"
  },
  "sources": {
    "source": {
      "type": "GdalSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "00000000-0000-0000-0000-000000000539"
        }
      }
    }
  }
}

Reprojection

The Reprojection operator reprojects data from one spatial reference system to another. It accepts exactly one input which can either be a raster or a vector data stream. The operator produces all data that, after reprojection, is contained in the query rectangle.

Data type specifics

The concrete behavior depends on the data type.

Vector data

The reprojection operator reprojects all coordinates of the features individually. The result contains all features that, after reprojection, are intersected by the query rectangle. If not all coordinates of the vector data stream could be projected, the operator returns an error.

Raster data

To create tiles in the target projection, the operator first loads the corresponding tiles in the source projection. Note that in order to create one reprojected output tile, it may be necessary to load multiple source tiles. For each output pixel, the operator takes the value of the input pixel nearest to its upper left corner.

In order to obtain precise results but avoid loading too much data, the operator estimates the resolution at which it loads the input raster stream. The estimate is based on the target resolution defined by the query rectangle and the relationship between the length of the diagonal of the query rectangle in both projections. Please refer to the source code for details.

If a tile, or part of a tile, is not available in the source projection because it is outside of the defined extent, the operator produces pixels with no data values. If the input raster stream has no no data value defined, the value 0 is used instead.

Parameters

Parameter | Type | Description | Example Value
targetSpatialReference | String | The srs string (authority:code) of the target spatial reference. | EPSG:4326

Inputs

The Reprojection operator expects exactly one raster or vector input.

Parameter | Type
source | RasterOrVectorOperator

Errors

The operator returns an error if the target projection is unknown or if the input data cannot be reprojected.

Example JSON

{
  "type": "Reprojection",
  "params": {
    "targetSpatialReference": "EPSG:4326"
  },
  "sources": {
    "source": {
      "type": "GdalSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
        }
      }
    }
  }
}

TemporalRasterAggregation

The TemporalRasterAggregation operator aggregates a raster time series into uniform time intervals (windows). The output is a time series that begins with the first window that contains the start of the query time. Each time slice has the same length, defined by the window parameter. The pixel values are computed by aggregating all input rasters that are valid in the current window using the defined aggregation method. The operator produces all output slices that are contained in the query time interval. The optional windowReference parameter allows specifying a custom anchor point for the windows. This is the conceptual origin from which the timeline is divided into uniform aggregation windows. By default, it is 1970-01-01T00:00:00Z, which means that windows of, e.g., 1 hour or 1 month begin at the full hour or the start of the month.

An example usage scenario is to transform a daily raster time series into monthly aggregates. Here, the query should start at the beginning of the month and the window should be 1 month. The aggregation method allows calculating, e.g., the maximum or mean value for each pixel. If we perform a query with time [2021-01-01, 2021-04-01), we would get a time series with three time steps. If we perform a query with an instant like [2021-01-01, 2021-01-01), we will get a single time step containing the aggregated values for January 2021.
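
The window alignment in this example can be reproduced with a Python sketch. It is limited to windows measured in months that are anchored at the default reference 1970-01-01T00:00:00Z. The function names are hypothetical, and the loop is only one reading of the description above.

```python
from datetime import datetime, timezone
from typing import List, Tuple

REFERENCE_MONTH = 1970 * 12  # month index of the default window reference


def month_index(instant: datetime) -> int:
    """Number of months since year 0, used to align windows."""
    return instant.year * 12 + (instant.month - 1)


def month_start(index: int) -> datetime:
    """First instant of the month with the given index (UTC)."""
    return datetime(index // 12, index % 12 + 1, 1, tzinfo=timezone.utc)


def monthly_windows(
    query_start: datetime, query_end: datetime, step: int = 1
) -> List[Tuple[datetime, datetime]]:
    """List the aggregation windows [start, end) produced for a query interval."""
    # Snap to the first window that contains the start of the query.
    offset = (month_index(query_start) - REFERENCE_MONTH) // step
    current = REFERENCE_MONTH + offset * step
    windows = []
    while True:
        windows.append((month_start(current), month_start(current + step)))
        current += step
        if month_start(current) >= query_end:
            break
    return windows


JAN = datetime(2021, 1, 1, tzinfo=timezone.utc)
APR = datetime(2021, 4, 1, tzinfo=timezone.utc)
print(len(monthly_windows(JAN, APR)))  # three monthly time steps
print(len(monthly_windows(JAN, JAN)))  # an instant query yields one time step
```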

Parameters

Parameter | Type | Description | Example Value
aggregation | Aggregation | method for aggregating pixels | {"type": "max", "ignoreNoData": false}
window | TimeStep | length of time steps | {"granularity": "Months", "step": 1}
windowReference | TimeInstance | (Optional) anchor point for the aggregation windows. Default value is 1970-01-01T00:00:00Z | 1970-01-01T00:00:00Z
outputType | RasterDataType | (Optional) A raster data type for the output. Same as input, if not specified. | U8

Types

The following describes the types used in the parameters.

Aggregation

There are different methods that can be used to aggregate the raster time series. Encountering a no data value makes the aggregation value of a pixel also no data unless the ignoreNoData parameter is set to true.

Variant | Parameters | Description
min | ignoreNoData: bool | minimum value
max | ignoreNoData: bool | maximum value
first | ignoreNoData: bool | first encountered value
last | ignoreNoData: bool | last encountered value
mean | ignoreNoData: bool | mean value
sum | ignoreNoData: bool | sum of the values
count | ignoreNoData: bool | count the number of values

Attention: For the variants sum and count, a saturating addition is used. This means that if the sum of two values exceeds the maximum value of the data type, the result is the maximum value of the data type. Users should therefore choose a data type that is large enough to hold the result of the aggregation.
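
A saturating addition for the U8 data type can be sketched in Python as follows. The function name is hypothetical. The point is only that the result is capped at the maximum of the data type instead of overflowing.

```python
U8_MAX = 255


def saturating_add_u8(a: int, b: int) -> int:
    """Add two U8 values and cap the result at the maximum U8 value."""
    return min(a + b, U8_MAX)


print(saturating_add_u8(200, 100))  # capped at 255 instead of 300
```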

Inputs

The TemporalRasterAggregation operator expects exactly one raster input.

Parameter | Type
raster | SingleRasterSource

Errors

If the aggregation method is first, last, or mean and the input raster has no no data value, an error is thrown.

Example JSON

{
  "type": "TemporalRasterAggregation",
  "params": {
    "aggregation": {
      "type": "max",
      "ignoreNoData": false
    },
    "window": {
      "granularity": "Months",
      "step": 1
    },
    "windowReference": "1970-01-01T00:00:00Z"
  },
  "sources": {
    "raster": {
      "type": "GdalSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
        }
      }
    }
  }
}

TimeProjection

The TimeProjection operator projects the timestamps of a vector dataset to new granularities and ranges. The output is a new vector dataset with the same geometry and attributes as the input. However, each time step is projected to a new time range. Moreover, the QueryRectangle's temporal extent is enlarged as well to include the projected time range.

An example usage scenario is to transform snapshot observations into yearly time slices. For instance, animal occurrences are observed at a daily granularity. If you want to aggregate the data to a yearly granularity, you can use the TimeProjection operator. This will change the validity of each element in the dataset to the full year where it was observed. This is, for instance, useful when you want to combine it with raster time series and use different temporal semantics than the originally recorded validities.
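
The yearly projection described above can be sketched as follows. This is a simplified illustration, not the operator's code: the function name is hypothetical, the step is fixed to one year, and representing the validity as a half-open interval [start, end) is an assumption.

```python
from datetime import datetime, timezone
from typing import Tuple


def project_to_year(instant: datetime) -> Tuple[datetime, datetime]:
    """Replace an observation time by the full year in which it was observed."""
    start = datetime(instant.year, 1, 1, tzinfo=timezone.utc)
    end = datetime(instant.year + 1, 1, 1, tzinfo=timezone.utc)
    return start, end


# An animal occurrence observed on a single day in May 2014 ...
observed = datetime(2014, 5, 17, 12, 0, tzinfo=timezone.utc)
# ... becomes valid for all of 2014.
start, end = project_to_year(observed)
print(start.isoformat(), end.isoformat())
```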

Parameters

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| step | TimeStep | time granularity and size for the projection | {"granularity": "years", "step": 1} |
| stepReference | TimeInstance | (Optional) an anchor point for the time step | "2010-01-01T00:00:00Z" |

Inputs

The TimeProjection operator expects exactly one vector input.

| Parameter | Type |
| --- | --- |
| vector | SingleVectorSource |

Errors

If the step is negative, an error is thrown.

Example JSON

{
  "type": "TimeProjection",
  "params": {
    "step": {
      "granularity": "years",
      "step": 1
    }
  },
  "sources": {
    "vector": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
        }
      }
    }
  }
}

TimeShift

The TimeShift operator allows retrieving data temporally relative to the actual QueryRectangle. It shifts the query rectangle by a given amount of time and modifies the result data accordingly. Users have two options for specifying the time shift:

  1. Relative shift – shift relative to the query rectangle, e.g., one month or one year to the past. This can be useful for comparing multiple points in time relative to the query rectangle.
  2. Absolute shift – change the query rectangle to a fixed temporal reference, e.g., January 2014. This can be used to compare data in the query rectangle's time to a fixed point of reference.

The output is either a stream of raster data or a stream of vector data depending on the input.

An example usage scenario is to compare the current time with the previous time of the same raster data. For instance, a raster source outputs monthly data aggregates of mean temperatures. If you want to compute the difference between the current month and the previous month, you can use the TimeShift operator. You will have two workflows. One is the unmodified temperature raster source. The other is the same source, shifted by one month. Then, you can use both workflows as sources of an Expression operator.

Note: This operator modifies the time values of the returned data. For rasters and vector data, it shifts the time intervals opposite to the time shift specified in the operator. This is necessary to have only data inside the result that is part of the QueryRectangle's time interval. As an example, we shift monthly data by one month to the past. Our query rectangle points to February. Then, the operator shifts the query rectangle to January. The data, originally valid for January, is shifted forward to February again, to fit into the original query rectangle, which is February.
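
The February and January example from the note can be traced with a short Python sketch. All function names are hypothetical, and the month arithmetic only handles timestamps on the first day of a month, which is enough for monthly data.

```python
from datetime import datetime, timezone
from typing import Tuple

Interval = Tuple[datetime, datetime]


def shift_months(t: datetime, months: int) -> datetime:
    """Shift a first-of-month timestamp by a whole number of months."""
    index = t.year * 12 + (t.month - 1) + months
    return t.replace(year=index // 12, month=index % 12 + 1)


def shift_query(query: Interval, months: int) -> Interval:
    """Step 1: move the query rectangle's time by the configured shift."""
    return shift_months(query[0], months), shift_months(query[1], months)


def shift_result_back(data_time: Interval, months: int) -> Interval:
    """Step 2: move the returned data's time in the opposite direction."""
    return shift_months(data_time[0], -months), shift_months(data_time[1], -months)


utc = timezone.utc
february = (datetime(2014, 2, 1, tzinfo=utc), datetime(2014, 3, 1, tzinfo=utc))

january = shift_query(february, -1)          # the source is queried for January
returned = shift_result_back(january, -1)    # January's data is relabeled as February
print(returned == february)                  # True
```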

Parameters

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| type | relative or absolute | whether to shift relatively or absolutely | "relative" |

Relative

If type is relative, you need to specify the following parameters:

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| granularity | TimeGranularity | time granularity for the shift | "months" |
| value | integer | the size of the step | -1 |

Absolute

If the type is absolute, you need to specify the following parameters:

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| timeInterval | TimeInterval | A fixed shift of the QueryRectangle's time | {"start": "2010-01-01T00:00:00Z", "end": "2010-02-01T00:00:00Z"} |

Inputs

The TimeShift operator expects either one vector input or one raster input.

| Parameter | Type |
| --- | --- |
| source | SingleRasterOrVectorSource |

Example JSON

{
  "type": "TimeShift",
  "params": {
    "type": "relative",
    "granularity": "months",
    "value": -1
  },
  "sources": {
    "source": {
      "type": "GdalSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "00000000-0000-0000-0000-000000000539"
        }
      }
    }
  }
}

{
  "type": "TimeShift",
  "params": {
    "type": "absolute",
    "time_interval": {
      "start": "2010-01-01T00:00:00Z",
      "end": "2010-02-01T00:00:00Z"
    }
  },
  "sources": {
    "source": {
      "type": "GdalSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "00000000-0000-0000-0000-000000000539"
        }
      }
    }
  }
}

VectorJoin

The VectorJoin operator allows combining multiple vector inputs into a single feature collection. There are multiple join variants defined, which are described below.

For instance, you want to join tabular data to a point collection of buildings. The point collection contains the geolocation of the buildings and their id. The attribute data collection has the building id and the height information. Combining the two feature collections leads to a single point collection with geolocation and height information.
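
The building example can be sketched in plain Python as an equi-join of a geo collection with a data collection. This is an illustration under assumptions, not the operator's implementation: the function name is hypothetical, features without a match are dropped (inner join), and every right-hand column whose name clashes with a left-hand column receives the suffix.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def equi_geo_to_data(left: List[Row], right: List[Row], left_column: str,
                     right_column: str, right_column_suffix: str = "right") -> List[Row]:
    """Join geo features (left) with attribute rows (right) on equal column values."""
    # Index the attribute rows by their join value for fast lookup.
    index: Dict[Any, List[Row]] = {}
    for row in right:
        index.setdefault(row[right_column], []).append(row)

    joined = []
    for feature in left:
        for row in index.get(feature[left_column], []):
            merged = dict(feature)
            for name, value in row.items():
                # Assumption: clashing right-hand names get the suffix appended.
                target = name + right_column_suffix if name in feature else name
                merged[target] = value
            joined.append(merged)
    return joined


buildings = [
    {"geometry": (8.77, 50.81), "id": 1},
    {"geometry": (8.78, 50.80), "id": 2},
]
heights = [{"id": 1, "height": 12.5}, {"id": 2, "height": 30.0}]

result = equi_geo_to_data(buildings, heights, "id", "id", "_other")
print(result[0])
```

Each resulting feature carries the geolocation from the left input and the height from the right input.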

Parameters

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| type | A value of EquiGeoToData, … | The type of join | "EquiGeoToData" |

EquiGeoToData

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| leftColumn | string | The column name of the left input | "id" |
| rightColumn | string | The column name of the right input | "id" |
| rightColumn_suffix | (Optional) string | A value to suffix the right join column to avoid name clashes with the columns of the left input. If nothing is specified, the default value is right. | "right" |

Inputs

The VectorJoin operator expects two vector inputs.

| Parameter | Type |
| --- | --- |
| left | SingleVectorSource |
| right | SingleVectorSource |

Errors

If the value of the leftColumn parameter is not a column of the left feature collection, an error is thrown.

If the value of the rightColumn parameter is not a column of the right feature collection, an error is thrown.

EquiGeoToData

If the left input is not a geo data collection, an error is thrown.

If the right input is not a (non-geo) data collection, an error is thrown.

Example JSON

{
  "type": "VectorJoin",
  "params": {
    "type": "EquiGeoToData",
    "leftColumn": "id",
    "rightColumn": "id",
    "rightColumnSuffix": "_other"
  },
  "sources": {
    "left": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "e977b123-ca47-4c5b-aace-481119826aaf"
        },
        "attributeProjection": ["name", "population"]
      }
    },
    "right": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "b6191257-6d61-4c6b-90a4-ebfb1b23899d"
        }
      }
    }
  }
}

VisualPointClustering

The VisualPointClustering is a clustering operator for point collections that removes clutter and preserves the spatial structure of the input. The output is a point collection with a count and radius attribute. The operator utilizes the input resolution of the query to determine when points, being displayed as circles, would overlap. Moreover, it allows aggregating non-geo attributes to preserve the other columns of the input. For more information on the algorithm, cf. the paper Beilschmidt, C. et al.: A Linear-Time Algorithm for the Aggregation and Visualization of Big Spatial Point Data. SIGSPATIAL/GIS 2017: 73:1-73:4.

An exemplary use case for this operator is the visualization of point data in an online map application. There, you can use this operator as the final step of the workflow to cluster the points and display them as circles. These circles then provide a decluttered view of the data, e.g., via a WFS endpoint.

Parameters

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| minRadiusPx | number | Minimum circle radius in screen pixels | 10 |
| deltaPx | number | Minimum circle-to-circle distance in screen pixels | 1 |
| radiusColumn | string | The new column name to store radius information in screen pixels | "__radius" |
| countColumn | string | The new column name to store the number of points represented by each circle | "__count" |
| columnAggregates | Map from string to aggregate definition (one of MeanNumber, StringSample or Null) | Specify how miscellaneous columns should be aggregated. You can optionally set a new Measurement. Otherwise, the Measurement is taken from the source column. | {"foo": {"columnName": "numericColumn", "aggregateType": "MeanNumber", "measurement": {"type": "unitless"}}, "bar": {"columnName": "textColumn", "aggregateType": "StringSample"}} |

Inputs

The VisualPointClustering operator expects exactly one vector input that must be a point collection.

| Parameter | Type |
| --- | --- |
| vector | SingleVectorSource |

Errors

If the vector source is not a point collection, an error is thrown.

If multiple columns in columnAggregates have the same names, an error is thrown.

Example JSON

{
  "type": "VisualPointClustering",
  "params": {
    "minRadiusPx": 8.0,
    "deltaPx": 1.0,
    "radiusColumn": "__radius",
    "countColumn": "__count",
    "columnAggregates": {
      "mean_population": {
        "columnName": "population",
        "aggregateType": "MeanNumber",
        "measurement": { "type": "unitless" }
      },
      "sample_names": {
        "columnName": "name",
        "aggregateType": "StringSample"
      }
    }
  },
  "sources": {
    "vector": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "e977b123-ca47-4c5b-aace-481119826aaf"
        },
        "attributeProjection": ["name", "population"]
      }
    }
  }
}

Plots

Plots are special kinds of operators that generate visualizations.

Geo Engine supports three output types:

  • jsonPlain: structured output in JSON format
  • jsonVega: a Vega-Lite visualization (cf. Vega-Lite)
  • imagePng: a PNG image

Thus, plots can contain statistics, visualizations, and images.

BoxPlot

The BoxPlot is a plot operator that computes a box plot over

  • a selection of numerical columns of a single vector dataset, or
  • multiple raster datasets.

Thereby, the operator considers all data in the given query rectangle.

The boxes of the plot span the 1st and 3rd quartile and highlight the median. The whiskers indicate the minimum and maximum values of the corresponding attribute or raster.
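
The five values behind each box can be computed as in the following Python sketch. The function name is hypothetical, and the quartile interpolation method used here (Python's "inclusive" method) is an assumption; Geo Engine may compute quartiles differently. Non-finite values are skipped, in line with the notes for this operator.

```python
import math
import statistics
from typing import Dict, Sequence


def box_plot_stats(values: Sequence[float]) -> Dict[str, float]:
    """Compute whisker ends, quartiles, and median for one box."""
    # Infinite and NaN values are ignored for the computation.
    finite = [v for v in values if math.isfinite(v)]
    q1, median, q3 = statistics.quantiles(finite, n=4, method="inclusive")
    return {
        "min": min(finite),  # lower whisker
        "q1": q1,            # lower edge of the box
        "median": median,    # highlighted line inside the box
        "q3": q3,            # upper edge of the box
        "max": max(finite),  # upper whisker
    }


print(box_plot_stats([1, 2, 3, 4, 5, float("nan")]))
```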

Vector Data

In the case of vector data, the operator generates one box for each of the selected numerical attributes. The operator returns an error if one of the selected attributes is not numeric.

Parameter

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| columnNames | Vec<String> | The names of the attributes to generate boxes for. | ["x","y"] |

Raster Data

For raster data, the operator generates one box for each input raster.

Parameter

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| columnNames | Vec<String> | Optional: An alias for each input source. The operator will automatically name the boxes Raster-1, Raster-2, ... if this parameter is empty. If aliases are given, the number of aliases must match the number of input rasters. Otherwise an error is returned. | ["A","B"] |

Inputs

The operator consumes exactly one vector or multiple raster operators.

| Parameter | Type |
| --- | --- |
| source | MultipleRasterOrSingleVectorSource |

Errors

The operator returns an error in the following cases.

  • Vector data: The attribute for one of the given columnNames is not numeric.
  • Vector data: The attribute for one of the given columnNames does not exist.
  • Raster data: The length of the columnNames parameter does not match the number of input rasters.

Notes

If your dataset contains infinite or NaN values, they are ignored for the computation. Moreover, if your dataset contains more than 10,000 values (which is likely for rasters), the median and quartiles are estimated using the P^2 algorithm described in:

R. Jain and I. Chlamtac, The P^2 algorithm for dynamic calculation of quantiles and histograms without storing observations, Communications of the ACM, Volume 28 (October), Number 10, 1985, p. 1076-1085. https://www.cse.wustl.edu/~jain/papers/ftp/psqr.pdf

Example JSON

Vector

{
  "type": "BoxPlot",
  "params": {
    "columnNames": ["x", "y"]
  },
  "sources": {
    "source": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
        }
      }
    }
  }
}

Raster

{
  "type": "BoxPlot",
  "params": {
    "columnNames": ["A", "B"]
  },
  "sources": {
    "source": [
      {
        "type": "GdalSource",
        "params": {
          "data": {
            "type": "internal",
            "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
          }
        }
      },
      {
        "type": "GdalSource",
        "params": {
          "data": {
            "type": "internal",
            "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
          }
        }
      }
    ]
  }
}

ClassHistogram

The ClassHistogram is a plot operator that computes a histogram plot either over categorical attributes of a vector dataset or categorical values of a raster source. The output is a plot in Vega-Lite specification.

For instance, you want to plot the frequencies of the classes of a categorical attribute of a feature collection. Then you can use a class histogram to visualize and assess this.

Parameters

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| columnName | string (optional) | The name of the attribute making up the x-axis of the histogram. Must be set for vector sources, must not be set for rasters. | "temperature" |

Inputs

The operator consumes either one vector or one raster operator.

| Parameter | Type |
| --- | --- |
| source | SingleRasterOrVectorSource |

Errors

The operator returns an error if…

  • the selected column (columnName) does not exist or is not numeric,
  • the source is a raster and the property columnName is set, or
  • the input Measurement is not categorical.

Notes

The operator only uses values of the categorical Measurement. It ignores missing or no-data values and values that are not covered by the Measurement.
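
The counting rule from the note can be sketched in Python. The function name and the class mapping are hypothetical, None stands in for missing or no-data values, and reporting classes with a count of zero is an assumption.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def class_histogram(values: Iterable[Optional[int]], classes: Dict[int, str]) -> Dict[str, int]:
    """Count how often each class of a categorical Measurement occurs."""
    # Missing values and values not covered by the Measurement are skipped.
    counts = Counter(v for v in values if v in classes)
    return {label: counts.get(key, 0) for key, label in classes.items()}


land_cover_classes = {1: "forest", 2: "water", 3: "urban"}  # hypothetical Measurement
pixel_values = [1, 1, 2, None, 7, 3, 1]                     # None: no data, 7: not a class

print(class_histogram(pixel_values, land_cover_classes))
# {'forest': 3, 'water': 1, 'urban': 1}
```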

Example JSON

{
  "type": "ClassHistogram",
  "params": {
    "columnName": "foobar"
  },
  "sources": {
    "source": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
        }
      }
    }
  }
}

FeatureAttributeValuesOverTime

The FeatureAttributeValuesOverTime is a plot operator that computes a multi-line plot for feature attribute values over time. For distinguishing features, the data requires an id column. The output is a plot in Vega-Lite specification.

For instance, you want to plot the NDVI values of a feature collection of trees. Then, you can use a multi-line plot to visualize the trees by their id.

Parameters

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| idColumn | string | The column name of the id attribute (one line per id). | "id" |
| valueColumn | string | The column name of the value attribute (y-axis values). | "temperature" |

Inputs

The operator consumes exactly one vector operator.

| Parameter | Type |
| --- | --- |
| vector | SingleVectorSource |

Errors

The operator returns an error if the selected columns (idColumn and valueColumn) do not exist or valueColumn is not numeric.

Notes

The operator processes a maximum of 20 different ids. After recognizing more than 20 different ids, the operator ignores the rest.
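
The id limit can be pictured with the following Python sketch, which groups rows into one series per id and skips rows of any id seen after the limit is reached. The function name and the "time" key are hypothetical; the actual operator works on feature collections rather than dictionaries.

```python
from typing import Any, Dict, Iterable, List, Tuple


def collect_series(rows: Iterable[Dict[str, Any]], id_column: str, value_column: str,
                   max_ids: int = 20) -> Dict[Any, List[Tuple[Any, float]]]:
    """Group (time, value) pairs by id, keeping at most `max_ids` distinct ids."""
    series: Dict[Any, List[Tuple[Any, float]]] = {}
    for row in rows:
        key = row[id_column]
        if key not in series:
            if len(series) >= max_ids:
                continue  # limit reached: ignore ids that were not seen before
            series[key] = []
        series[key].append((row["time"], row[value_column]))
    return series


rows = [
    {"id": "tree-a", "time": "2014-01", "ndvi": 0.31},
    {"id": "tree-b", "time": "2014-01", "ndvi": 0.42},
    {"id": "tree-c", "time": "2014-01", "ndvi": 0.55},
    {"id": "tree-a", "time": "2014-02", "ndvi": 0.35},
]
# A limit of 2 is used here only to keep the example small.
print(collect_series(rows, "id", "ndvi", max_ids=2))
```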

Example JSON

{
  "type": "FeatureAttributeValuesOverTime",
  "params": {
    "idColumn": "id",
    "valueColumn": "temperature"
  },
  "sources": {
    "vector": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
        }
      }
    }
  }
}

Histogram

The Histogram is a plot operator that computes a histogram plot either over attributes of a vector dataset or values of a raster source. The output is a plot in Vega-Lite specification.

For instance, you want to plot the data distribution of numeric attributes of a feature collection. Then you can use a histogram with a suitable number of buckets to visualize and assess this.

Parameters

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| columnName | string, ignored for raster input | The name of the attribute making up the x-axis of the histogram. | "temperature" |
| bounds | HistogramBounds (either data or specified values) | If data, it computes the bounds of the underlying data. If values, one can specify custom bounds. | {"min": 0.0, "max": 20.0} or "data" |
| buckets | (Optional) number | The number of buckets. The value is calculated, if not specified. | 20 |
| interactive | (Optional) boolean | Flag, if the histogram should have user interactions for a range selection. It is false by default. | true |

Inputs

The operator consumes either one vector or one raster operator.

| Parameter | Type |
| --- | --- |
| source | SingleRasterOrVectorSource |

Errors

The operator returns an error if the selected column (columnName) does not exist or is not numeric.

Notes

If bounds or buckets are not defined, the operator determines these values itself, which requires processing the data twice.

If the buckets parameter is unset, the operator estimates it using the square root of the number of elements in the data.

Example JSON

{
  "type": "Histogram",
  "params": {
    "columnName": "foobar",
    "bounds": {
      "min": 5.0,
      "max": 10.0
    },
    "buckets": 15,
    "interactive": false
  },
  "sources": {
    "source": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
        }
      }
    }
  }
}

MeanRasterPixelValuesOverTime

The MeanRasterPixelValuesOverTime is a plot operator that computes a time series plot of mean raster values. For each time step in the raster time series, it computes one mean value. The output is a plot in Vega-Lite specification.

For instance, you want to plot the mean temperature of a monthly raster time series. Then, you can use this operator to generate a time series plot.
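
Conceptually, the operator reduces every time step of the raster time series to a single number. A minimal Python sketch, assuming each time step is given as a label plus a small grid of pixel values (both the function name and the data layout are hypothetical, and no-data handling is left out):

```python
import statistics
from typing import List, Sequence, Tuple


def mean_per_time_step(time_series: Sequence[Tuple[str, List[List[float]]]]) -> List[Tuple[str, float]]:
    """Compute one mean value per raster in the time series."""
    result = []
    for time_start, grid in time_series:
        pixels = [value for row in grid for value in row]  # flatten the grid
        result.append((time_start, statistics.fmean(pixels)))
    return result


monthly_temperature = [
    ("2014-01-01", [[1.0, 2.0], [3.0, 4.0]]),
    ("2014-02-01", [[2.0, 4.0], [6.0, 8.0]]),
]
print(mean_per_time_step(monthly_temperature))
# [('2014-01-01', 2.5), ('2014-02-01', 5.0)]
```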

Parameters

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| timePosition | string (start, center or end) | Where should the x-axis (time) tick be positioned? At either time start, time end or in the center. | "start" |
| area | (Optional) boolean | Whether to fill the area under the curve. Defaults to true. | false |

Inputs

The operator consumes exactly one raster operator.

| Parameter | Type |
| --- | --- |
| raster | SingleRasterSource |

Example JSON

{
  "type": "MeanRasterPixelValuesOverTime",
  "params": {
    "timePosition": "start",
    "area": true
  },
  "sources": {
    "raster": {
      "type": "GdalSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
        }
      }
    }
  }
}

ScatterPlot

The ScatterPlot is a plot operator that computes a scatter plot over two attributes of a vector dataset. Thereby, the operator considers all data in the given query rectangle.

In case of more than 500 points to plot, the representation changes from a regular scatter plot to a 2D Histogram with buckets determined from the underlying data.

Parameters

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| columnX | String | The name of the attribute making up the x-axis of the plot. | "width" |
| columnY | String | The name of the attribute making up the y-axis of the plot. | "height" |

Inputs

The operator consumes exactly one vector operator.

| Parameter | Type |
| --- | --- |
| vector | SingleVectorSource |

Errors

The operator returns an error if one of the selected columns does not exist or is not numeric.

Notes

If your dataset contains infinite or NaN values, they are ignored for the computation. Moreover, if your dataset contains more than 10,000 values, the buckets of the histogram are determined from the first 10,000 values. Later values that fall outside those bounds are ignored.

Example JSON

{
  "type": "ScatterPlot",
  "params": {
    "columnX": "width",
    "columnY": "height"
  },
  "sources": {
    "vector": {
      "type": "OgrSource",
      "params": {
        "data": {
          "type": "internal",
          "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
        }
      }
    }
  }
}

Statistics

The Statistics operator is a plot operator that computes count statistics over

  • a selection of numerical columns of a single vector dataset, or
  • multiple raster datasets.

The output is a JSON description.

For instance, you want to get an overview of a raster data source. Then, you can use this operator to get basic count statistics.

Vector Data

In the case of vector data, the operator generates one statistic for each of the selected numerical attributes. The operator returns an error if one of the selected attributes is not numeric.

Parameter

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| columnNames | Vec<String> | The names of the attributes to generate statistics for. | ["x","y"] |

Raster Data

For raster data, the operator generates one statistic for each input raster.

Parameter

| Parameter | Type | Description | Example Value |
| --- | --- | --- | --- |
| columnNames | Vec<String> | Optional: An alias for each input source. The operator will automatically name the rasters Raster-1, Raster-2, ... if this parameter is empty. If aliases are given, the number of aliases must match the number of input rasters. Otherwise an error is returned. | ["A","B"] |

Inputs

The operator consumes exactly one vector or multiple raster operators.

| Parameter | Type |
| --- | --- |
| source | MultipleRasterOrSingleVectorSource |

Errors

The operator returns an error in the following cases.

  • Vector data: The attribute for one of the given columnNames is not numeric.
  • Vector data: The attribute for one of the given columnNames does not exist.
  • Raster data: The length of the columnNames parameter does not match the number of input rasters.

Example JSON

{
  "type": "Statistics",
  "params": {
    "columnNames": ["A"]
  },
  "sources": {
    "source": [
      {
        "type": "GdalSource",
        "params": {
          "data": {
            "type": "internal",
            "datasetId": "a626c880-1c41-489b-9e19-9596d129859c"
          }
        }
      }
    ]
  }
}

Example Output

{
  "A": {
    "valueCount": 6,
    "validCount": 6,
    "min": 1.0,
    "max": 6.0,
    "mean": 3.5,
    "stddev": 1.707
  }
}
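
The fields of such an output can be reproduced with a short Python sketch. It is an illustration under assumptions: the function name is hypothetical, valueCount is taken to count all values and validCount only the valid ones, None stands in for no data, and at least one valid value is expected. For the six values 1 to 6, the population standard deviation is about 1.708, which is close to the stddev in the example output.

```python
import math
from typing import Dict, Optional, Sequence


def count_statistics(values: Sequence[Optional[float]]) -> Dict[str, float]:
    """Compute basic count statistics for one raster or attribute."""
    valid = [v for v in values if v is not None and math.isfinite(v)]
    mean = sum(valid) / len(valid)
    # Population variance: divide by the number of valid values.
    variance = sum((v - mean) ** 2 for v in valid) / len(valid)
    return {
        "valueCount": len(values),
        "validCount": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": mean,
        "stddev": math.sqrt(variance),
    }


print(count_statistics([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
```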