pywaterinfo package

Waterinfo

class pywaterinfo.Waterinfo(provider: str = 'vmm', token: str | None = None, proxies: dict | None = None, cache: bool = False)[source]

Bases: object

clear_cache()[source]

Clean up the cache.

get_group_list(group_name=None, group_type=None, **kwargs)[source]

Get a list of time series and station groups

The function provides the existing group identifiers. These group_ids enable the user to request all values of a given group at the same time (method get_timeseries_value_layer or get_timeseries_values).

Parameters:
  • group_name (str) – Name of the time series group, can contain wildcards, e.g. ‘*Download*’

  • group_type ('station' | 'parameter' | 'timeseries') – Type of group: station, parameter or timeseries

  • kwargs – Additional query fields as accepted by the KIWIS call getGroupList, see API documentation

Returns:

DataFrame with an overview of the groups provided by the API

Return type:

pd.DataFrame

Examples

>>> from pywaterinfo import Waterinfo
>>> vmm = Waterinfo("vmm")
>>>
>>> # all available group ids provided by VMM
>>> df = vmm.get_group_list()
>>>
>>> # all available group ids provided by VMM that represent a time series
>>> df = vmm.get_group_list(group_type='timeseries')
>>>
>>> # all available group ids provided by VMM containing 'Download' in
>>> # the group name
>>> df = vmm.get_group_list(group_name='*Download*')
>>>
>>> hic = Waterinfo("hic")
>>>
>>> # all available group ids provided by HIC
>>> df = hic.get_group_list()

get_timeseries_list(station_no=None, stationparameter_name=None, **kwargs)[source]

Get the time series at a given station and/or the time series that provide a given parameter

station_no and stationparameter_name are explicit arguments because they cover the typical use cases: they appear on the waterinfo.be download pages as the ‘station_number’ and ‘parameter’ columns, respectively.

By default, all return fields are included in the returned DataFrame; pass returnfields as an additional argument to select a subset.

Parameters:
  • station_no (str) – single or multiple station_no values, comma-separated

  • stationparameter_name (str) – single or multiple stationparameter_name values, comma-separated

  • kwargs – Additional query fields as accepted by the KIWIS call getTimeseriesList, see API documentation

Returns:

DataFrame with each row the time series metadata

Return type:

pd.DataFrame

Examples

>>> from pywaterinfo import Waterinfo
>>> vmm = Waterinfo("vmm")
>>>
>>> # for given station ME09_012, which time series are available?
>>> df = vmm.get_timeseries_list(station_no="ME09_012") 
>>>
>>> # for a given parameter PET, which time series are available?
>>> df = vmm.get_timeseries_list(parametertype_name="PET") 
>>>
>>> # for a given parameter PET and station ME09_012, which time series
>>> # are available?
>>> df = vmm.get_timeseries_list(parametertype_name="PET",
...                              station_no="ME09_012")
>>>
>>> # for a given parametertype_id 11502, which time series are available?
>>> df = vmm.get_timeseries_list(parametertype_id="11502")
>>>
>>> # only interested in a subset of the returned columns: ts_id, station_name,
>>> # stationparameter_longname
>>> df = vmm.get_timeseries_list(parametertype_id="11502",
...                returnfields="ts_id,station_name,stationparameter_longname")
>>>
>>> hic = Waterinfo("hic")
>>>
>>> # for a given parameter EC, which time series are available?
>>> df = hic.get_timeseries_list(parametertype_name="EC")
>>>
>>> # for a given station plu03a-1066, which time series are available?
>>> df = hic.get_timeseries_list(station_no="plu03a-1066")
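
Arguments such as station_no and stationparameter_name take single or multiple values as one comma-separated string. The following is a minimal sketch of building such a string from a Python sequence. The helper name join_ids is hypothetical and not part of pywaterinfo, and the second station number is a placeholder.

```python
from typing import Iterable, Union


def join_ids(values: Union[str, int, Iterable[Union[str, int]]]) -> str:
    """Return a comma-separated string from one value or a sequence of values.

    Hypothetical helper for illustration, not part of pywaterinfo.
    """
    if isinstance(values, (str, int)):
        # A single identifier is passed through as a string
        return str(values).strip()
    # Strip stray whitespace and skip empty entries
    parts = [str(value).strip() for value in values]
    return ",".join(part for part in parts if part)


# "XX00_000" is a placeholder, not a real station number
station_no = join_ids(["ME09_012", " XX00_000 "])
print(station_no)
```

The resulting string can then be passed as station_no; the same pattern applies to ts_id and timeseriesgroup_id.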

get_timeseries_value_layer(timeseriesgroup_id=None, ts_id=None, bbox=None, **kwargs)[source]

Get metadata and the last measured value for a group of stations

Either ts_id, timeseriesgroup_id or bbox can be used to request data. The function provides metadata and the last measured value for the group of ids/stations.

Note: passing an additional ‘date’ argument requests the value at another moment in time instead of the last measured value.

Parameters:
  • ts_id (str) – single or multiple ts_id values, comma-separated

  • timeseriesgroup_id (str) – single or multiple group identifiers, comma-separated

  • bbox – Comma-separated list of four values in the order min_x, min_y, max_x, max_y; use the ‘crs’ parameter to choose between local and global coordinates. The fields stationparameter_no and ts_shortname are required for bbox; the function selects 0 or 1 time series per station in the area according to the filters

  • kwargs – Additional query parameter options as documented by the KIWIS waterinfo API, see API documentation

Returns:

DataFrame with one row per time series in the group, containing the measurement and metadata

Return type:

pd.DataFrame

Examples

>>> from pywaterinfo import Waterinfo
>>> vmm = Waterinfo("vmm")
>>>
>>> # get the metadata and last measured value on a single time series
>>> df = vmm.get_timeseries_value_layer(ts_id=78124042)
>>>
>>> # get the metadata and last measured value of all members of a
>>> # time series group
>>> df = vmm.get_timeseries_value_layer(timeseriesgroup_id=192928)
>>>
>>> # get the measured value of all members of a time series group on
>>> # a given time stamp
>>> df = vmm.get_timeseries_value_layer(timeseriesgroup_id=192928,
...                                     date="20190501")
>>>
>>> # Limit the number of returned fields/columns in response
>>> df = vmm.get_timeseries_value_layer("192780",
...     returnfields="timestamp,ts_value", metadata="false")
>>>
>>> hic = Waterinfo("hic")
>>>
>>> # get the metadata and last measured value of the oxygen concentration
>>> # (group id 156207) and conductivity (group id 156173) combined
>>> df = hic.get_timeseries_value_layer(timeseriesgroup_id="156207,156173")
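
The bbox argument expects four comma-separated values in the order min_x, min_y, max_x, max_y. The following is a minimal sketch of assembling that string with a basic sanity check. The helper format_bbox is hypothetical and the coordinates are illustrative only.

```python
def format_bbox(min_x: float, min_y: float, max_x: float, max_y: float) -> str:
    """Return 'min_x,min_y,max_x,max_y' after checking the corner order.

    Hypothetical helper for illustration, not part of pywaterinfo.
    """
    if min_x > max_x or min_y > max_y:
        raise ValueError("expected min_x <= max_x and min_y <= max_y")
    return ",".join(str(value) for value in (min_x, min_y, max_x, max_y))


# Illustrative coordinates; whether they are local or global depends on 'crs'
bbox = format_bbox(3.5, 50.7, 4.5, 51.2)
print(bbox)
```

As stated in the parameter description, a bbox request also needs the stationparameter_no and ts_shortname fields, which are passed as additional keyword arguments.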

get_timeseries_values(ts_id=None, timeseriesgroup_id=None, period=None, start=None, end=None, **kwargs)[source]

Get time series data from waterinfo.be

Given ts_id codes or group identifiers and a date period, download the corresponding time series from the waterinfo.be website. Each ts_id identifier corresponds to a given variable-location-frequency combination (e.g. precipitation, Waregem, daily). When interested in daily, monthly or yearly aggregates, look up the identifiers of those aggregated series to avoid too many or too large requests.

Note: ‘start’ and ‘end’ are used instead of the API defaults from/to because from is a reserved keyword in Python.
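
The naming constraint can be illustrated with a minimal sketch that maps start/end onto the from/to keys of a query dict. The helper to_kiwis_time_range and the timestamp format are assumptions made for illustration; they do not describe how pywaterinfo formats dates internally.

```python
import keyword
from datetime import datetime
from typing import Optional, Union

DateLike = Union[datetime, str]


def to_kiwis_time_range(start: Optional[DateLike] = None,
                        end: Optional[DateLike] = None) -> dict:
    """Map start/end to 'from'/'to' query keys (hypothetical helper)."""

    def as_text(value: DateLike) -> str:
        if isinstance(value, datetime):
            # Assumed timestamp format, chosen for illustration only
            return value.strftime("%Y-%m-%d %H:%M:%S")
        return str(value)

    query = {}
    if start is not None:
        query["from"] = as_text(start)  # fine as a dict key
    if end is not None:
        query["to"] = as_text(end)
    return query


# 'from' cannot be a parameter name: f(from="20190502") is a syntax error
is_reserved = keyword.iskeyword("from")
query = to_kiwis_time_range(start="20190502", end=datetime(2019, 5, 3))
print(is_reserved, query)
```

Because from is only a problem as an identifier, it can still be used as a string key inside a dict.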

Parameters:
  • ts_id (str) – single or multiple ts_id values, comma-separated

  • timeseriesgroup_id (str) – single or multiple group identifiers, comma-separated

  • period (str) – Period string in the format required by waterinfo: P#Y#M#DT#H#M#S, where P marks a period, each # is an integer value and the codes define the number of:

    Y - years
    M - months
    D - days
    T - required if information about sub-day resolution is present
    H - hours
    M - minutes
    S - seconds

    Instead of D (days), W (weeks) can be used as well. Examples of valid period strings: P3D, P1Y, P1DT12H, PT6H, P1Y6M3DT4H20M30S.

  • start (datetime | str) – Either a Python datetime object or a string that can be interpreted as a valid Timestamp.

  • end (datetime | str) – Either a Python datetime object or a string that can be interpreted as a valid Timestamp.

  • kwargs – Additional query parameter options as documented by the KIWIS waterinfo API, see API documentation

Returns:

DataFrame with the time series data; datetimes are in UTC.

Return type:

pd.DataFrame

Examples

>>> from pywaterinfo import Waterinfo
>>> vmm = Waterinfo("vmm")
>>>
>>> # get last day of data for the time series with ID 78124042
>>> df = vmm.get_timeseries_values(78124042, period="P1D")
>>>
>>> # get last day data of time series with ID 78124042 with subset of columns
>>> my_columns = ("Timestamp,Value,Interpolation Type,Quality Code,Quality"
...               " Code Name,Quality Code Description")
>>> df = vmm.get_timeseries_values(78124042, period="P1D",
...                                returnfields=my_columns)
>>>
>>> # get the data for ts_id 60992042 and 60968042 (Moerbeke_P and Waregem_P)
>>> # for 20190502 till 20190503
>>> # Note: UTC as time unit is used as input and asked as output by default
>>> df = vmm.get_timeseries_values("60992042,60968042",
...                           start="20190502", end="20190503")
>>>
>>> # One can overwrite the timezone to request data in another time zone:
>>> df = vmm.get_timeseries_values("60992042,60968042",
...                           start="20190502", end="20190503", timezone="CET")
>>>
>>> # get the data for all stations from groups 192900 (yearly rain sum)
>>> # and 192895 (yearly discharge average) for the last 2 years
>>> df = vmm.get_timeseries_values(timeseriesgroup_id="192900,192895",
...                                period="P2Y")  
>>>
>>> hic = Waterinfo("hic")
>>>
>>> # get last day of data for the time series with ID 44223010
>>> df = hic.get_timeseries_values(ts_id="44223010", period="P1D")
>>>
>>> # get last day data of time series with ID 44223010 with subset of columns
>>> df = hic.get_timeseries_values(ts_id="44223010", period="P1D",
...          returnfields="Timestamp,Value,Interpolation Type,Quality Code")
>>>
>>> # get last 10 hours data from Antwerpen tij/Zeeschelde (ts_id 53995010)
>>> # containing 'Tide Number' info. Tide number is useful when requesting
>>> # tidal extremes (high water/low water, 'ts_name'=Pv.HWLW)
>>> df = hic.get_timeseries_values(ts_id="53995010", period="PT10H",
...          returnfields="Timestamp,Value,Tide Number")
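
The period format P#Y#M#DT#H#M#S described above can be checked or assembled before sending a request. The following is a minimal sketch using a regular expression; the names PERIOD_PATTERN, is_valid_period and build_period are hypothetical, and the pattern is an independent reading of the format description, not the validation pywaterinfo itself performs.

```python
import re

# Date part (Y, M, D or W) followed by an optional time part introduced by T
PERIOD_PATTERN = re.compile(
    r"P(?=.)"                                # 'P' followed by at least one code
    r"(?:\d+Y)?(?:\d+M)?(?:\d+[DW])?"        # years, months, days or weeks
    r"(?:T(?=\d)(?:\d+H)?(?:\d+M)?(?:\d+S)?)?"  # 'T' needs at least one code
)


def is_valid_period(period: str) -> bool:
    """Return True if the string matches the P#Y#M#DT#H#M#S layout."""
    return PERIOD_PATTERN.fullmatch(period) is not None


def build_period(years=0, months=0, days=0, hours=0, minutes=0, seconds=0) -> str:
    """Assemble a period string from integer components, skipping zeros."""
    date_codes = ((years, "Y"), (months, "M"), (days, "D"))
    time_codes = ((hours, "H"), (minutes, "M"), (seconds, "S"))
    if any(value < 0 for value, _ in date_codes + time_codes):
        raise ValueError("period components must be non-negative")
    date_part = "".join(f"{value}{code}" for value, code in date_codes if value)
    time_part = "".join(f"{value}{code}" for value, code in time_codes if value)
    if not date_part and not time_part:
        raise ValueError("at least one component must be non-zero")
    # 'T' is only added when sub-day information is present
    return "P" + date_part + ("T" + time_part if time_part else "")


period = build_period(days=1, hours=12)
print(period, is_valid_period(period))
```

A string built this way can be passed as the period argument, e.g. period=build_period(hours=10) for the last ten hours.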

request_kiwis(query: dict, headers: dict | None = None) → dict[source]

HTTP call to the waterinfo.be KIWIS API

General call used to request information and data from waterinfo.be, providing error handling and JSON parsing. The service, type, format (json), datasource and timezone (UTC) are provided by default (but can be overridden by adding them to the query).

Specific methods are provided for the queries getTimeseriesList, getTimeseriesValues, getTimeseriesValueLayer and getGroupList; this method gives access to the other available queries as well.

Parameters:
  • query (dict) – dict of query options to be used together with the base string

  • headers (dict) – authentication header for the call

Return type:

parsed JSON object, full HTTP response

Examples

>>> from pywaterinfo import Waterinfo
>>> vmm = Waterinfo("vmm")
>>> # get the API info/documentation from kiwis
>>> data, res = vmm.request_kiwis({"request": "getRequestInfo"})
>>> data        
[{'Title': 'KISTERS QueryServices - Request Inform...}}}}}]
>>> res.status_code
200
>>> # get the timeseries data from last day from time series 78124042
>>> data, res = vmm.request_kiwis({"request": "getTimeseriesValues",
...                                "ts_id": "78124042",
...                                "period": "P1D"})
>>> data        
[{'ts_id': '78124042'...]]}]
>>> # get all stations starting with a P in the station_no
>>> data, res = vmm.request_kiwis({"request": "getStationList",
...                                "station_no": "P*"})
>>> data        
[['station_name'...]]
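
The statement that default query fields can be overridden by adding them to the query can be pictured as a dict merge in which user-supplied keys win. The following is a minimal sketch under that assumption; DEFAULT_QUERY and merge_query are hypothetical names, and the angle-bracket values are placeholders rather than the values pywaterinfo actually sends.

```python
# Placeholder defaults: only 'format' and 'timezone' values come from the text
DEFAULT_QUERY = {
    "service": "<service>",
    "type": "<type>",
    "format": "json",
    "datasource": "<datasource>",
    "timezone": "UTC",
}


def merge_query(query: dict, defaults: dict = DEFAULT_QUERY) -> dict:
    """Return a new dict where keys in 'query' override the defaults."""
    return {**defaults, **query}


merged = merge_query({
    "request": "getTimeseriesValues",
    "ts_id": "78124042",
    "period": "P1D",
    "timezone": "CET",  # overrides the UTC default
})
print(merged["format"], merged["timezone"])
```

Keys not present in the query keep their default value, and the defaults themselves are left unmodified.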

Utility functions

exception pywaterinfo.utils.SSLAdditionException[source]

Bases: Exception

Raised when the SSL custom CA addition fails

pywaterinfo.utils.add_ssl_cert(ssl_cert: str)[source]

This routine is a pragmatic solution to add a custom SSL certificate to the certifi store, which urllib needs to connect over HTTPS. Use this routine when you are experiencing an SSL: CERTIFICATE_VERIFY_FAILED error. This should only be done once for your environment.

For more details, see also: https://stackoverflow.com/questions/27835619/urllib-and-ssl-certificate-verify-failed-error

The SSL certificate is usually issued by your company; please contact your network administrator.

Parameters:

ssl_cert (str) – The full path/filename to the SSL certificate file to add

Examples

>>> from pywaterinfo.utils import add_ssl_cert
>>> add_ssl_cert("CA-FILE-PATH")
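
Conceptually, adding a certificate to a CA store amounts to appending its PEM text to a bundle file, and doing so only once. The following is a minimal sketch of that idea using temporary files. The helper append_certificate is hypothetical and the certificate bodies are dummy text; the actual add_ssl_cert implementation may work differently.

```python
import tempfile
from pathlib import Path


def append_certificate(bundle_path: str, cert_path: str) -> bool:
    """Append a PEM certificate to a bundle unless it is already present.

    Returns True when the bundle was modified (hypothetical helper).
    """
    bundle = Path(bundle_path)
    cert_text = Path(cert_path).read_text().strip()
    if cert_text in bundle.read_text():
        # Already added: keeps the operation safe to repeat
        return False
    with bundle.open("a") as handle:
        handle.write("\n" + cert_text + "\n")
    return True


# Demonstration on temporary files with dummy certificate content
with tempfile.TemporaryDirectory() as tmp_dir:
    bundle_file = Path(tmp_dir) / "cacert.pem"
    cert_file = Path(tmp_dir) / "company-ca.pem"
    bundle_file.write_text(
        "-----BEGIN CERTIFICATE-----\nEXISTING\n-----END CERTIFICATE-----\n"
    )
    cert_file.write_text(
        "-----BEGIN CERTIFICATE-----\nCUSTOM\n-----END CERTIFICATE-----\n"
    )

    first_call = append_certificate(str(bundle_file), str(cert_file))
    second_call = append_certificate(str(bundle_file), str(cert_file))
    bundle_text = bundle_file.read_text()

print(first_call, second_call)
```

The membership check mirrors the advice that the addition should only be done once per environment.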