Conversation
|
Very interesting! I've been looking at something similar previously. May I suggest aligning the API with OGC API Publish-Subscribe Workflow? Note that while the specification uses MQTT as an example, it does not prescribe any delivery mechanism, so SSE or WebSockets can be used. It should mostly be a question of using the CloudEvents payload, and potentially adding realtime as something that can be enabled on a new OGC API Features layer type, rather than a dedicated realtime layer type. pygeoapi has an existing implementation of the standard, against which a client could be tested. |
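For reference, a CloudEvents 1.0 envelope around a GeoJSON feature might look like the sketch below. The `type` and `source` values are made up for illustration; the spec only mandates `specversion`, `id`, `source` and `type`, and `application/geo+json` is the registered media type for GeoJSON.

```json
{
  "specversion": "1.0",
  "id": "a1b2c3d4",
  "source": "/collections/vehicles",
  "type": "org.example.feature.update",
  "time": "2024-01-01T12:00:00Z",
  "datacontenttype": "application/geo+json",
  "data": {
    "type": "Feature",
    "id": "vehicle-42",
    "geometry": { "type": "Point", "coordinates": [18.07, 59.33] },
    "properties": { "speed": 12.5 }
  }
}
```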
|
@02JanDal, interesting input. Before I wrote this implementation I did a quick survey to see if there were any standards to use before making my own format, as standards would be the way to go. Being a GIS application, the first place I looked was OGC, but I thought OGC API Features pub/sub only supported MQTT, and I wanted to keep the implementation dependency free with a very lightweight footprint, so I chose SSE with custom events, as in general there seemed to be no clear standard that stood out. After reading up on OGC API again, most of the OGC API Features machinery seems to be about searchability and metadata, which in the end gives you a link to the broker where the actual stream can be found. My implementation cuts to the chase and goes straight to the SSE stream, as the author of the Origo configuration would most likely know the correct connection parameters.

My goal from the beginning was to implement a lightweight client which communicates with a custom server side that is easy to implement for different source data, instead of making layer types for every possible kind of real time source. The aim was to make the server side only a proxy to external real time data sources. As there are too many different kinds of real time sources, I thought that hiding the implementation server side would make it easy to provide for any kind of source.

It is a good point that a real time layer could be an OGC Features layer with real time support, but in theory any kind of layer could potentially return a link to a stream where future updates to the initially received data will be published. In fact, one of my first ideas was to extend the WFS layer with real time capabilities, but with static configuration instead of receiving a URL to the stream. WFS would have been used to get the initial state, followed by an SSE stream with updates.

When testing my implementation I have also identified the possible need of a "landing URL" before connecting to the actual stream.
In my case it is mainly because SSE does not provide enough mechanisms for authentication and error handling. Taking inspiration from Trafikverket's real time API (https://data.trafikverket.se/documentation/datacache/the-request), such an implementation would use a REST POST endpoint to which the initial filters and authentication are sent. The initial request would then create a session and return a link to the stream for that session, using the session id as a parameter. When you think about it, that is pretty much like an OGC API Features request that returns a link to a stream.

So, your comment has really made me question my decisions and I'm totally uncertain where to go next. Implementing a complete OGC API Features layer seems like a bigger task than I expected, but in the end probably useful. We actually have an issue on creating an OGC Features layer (#1977). Adding real time capabilities to a layer is actually a pretty small task; my implementation is less than 100 relevant lines of code. Implementing the events in the OGC proposed GeoJSON payload format would be no big task either. It's pretty much the same format as I use, although my format is not as formal: both are basically GeoJSON wrapped in a control object with some event information. It would add a bit to the overhead though. Maybe I could have a look at implementing a very minimal OGC Features API layer with an SSE stream using the GeoJSON or CloudEvents payload. |
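The "landing URL" bootstrap described above could be sketched as below. The endpoint shape and the response fields (`sessionId`, `streamUrl`) are hypothetical; the point is that credentials and filters travel in a regular POST, since `EventSource` itself cannot send custom headers.

```javascript
// Sketch of a "landing URL" bootstrap, assuming a hypothetical REST endpoint
// that accepts filters and credentials and answers with a per-session stream link.
async function openSessionStream(landingUrl, filters, apiKey) {
  // POST the initial filters and authentication; SSE cannot carry custom
  // headers, so credentials go in this bootstrap request instead.
  const res = await fetch(landingUrl, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${apiKey}`
    },
    body: JSON.stringify(filters)
  });
  if (!res.ok) throw new Error(`Session setup failed: ${res.status}`);
  // Assumed response shape:
  // { "sessionId": "abc", "streamUrl": "https://.../stream?session=abc" }
  const session = await res.json();
  // Connect to the stream created for this session.
  return new EventSource(session.streamUrl);
}
```

This mirrors the Trafikverket pattern: the session id ends up as a query parameter on the stream URL, which a plain GET-only `EventSource` can handle.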
|
After looking into it in more detail, I still find the OGC API pub/sub road interesting, but I cannot see how to combine it with SSE and still maintain compatibility with the standards involved:
All in all, it seems to me that following the hub link would require some knowledge of how the hub works before trying to connect. This has given me a lot of headache and it now leaves me with three options:
Right now I'm leaning towards 3. The main benefit over 2 is that WebSockets seem more likely to conform to the standards. The drawback with WebSockets is that I find them a bit harder to implement robustly, and they could potentially cause more trouble with load balancers, reverse proxies etc. I could have misunderstood some things, so feel free to correct me @02JanDal . |
|
Given that it's the only existing server implementation for this I'm aware of, but also usually one of the first to implement new OGC API standards, I'd probably lean towards whatever would work with pygeoapi. Annoyingly, the documentation doesn't provide a lot of hints if one doesn't want to use MQTT, though. It should be relatively trivial to create a simple broker that accepts SSE clients and events as sent by pygeoapi's HTTP-broker option, at least for testing (it might also be an option to integrate that into origo-server). Your option 3 definitely has the upside of following the existing (draft) specifications the closest. On the other hand, SSE would be the better technical choice here as it's a unidirectional data flow. I think a decent way to tell the client that a hub speaks SSE would be to simply use the link type:

```json
"links": [
  ...
  {
    "rel": "hub",
    "href": "https://...",
    "type": "text/event-stream"
  }
]
```

Alternatively, I don't think simply deciding that Origo only speaks SSE (for now) and documenting that would be terrible either. |
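Client-side, the convention suggested above could be resolved with a small helper. The fallback-to-WebSocket rule for hub links without a `text/event-stream` type is an assumption, not anything the draft specifies.

```javascript
// Sketch: choose a transport from a collection's "links" array, assuming the
// convention that a rel="hub" link typed "text/event-stream" means SSE and
// any other hub link (no type, or a ws:// href) means WebSocket.
function pickHubTransport(links) {
  const hub = (links || []).find((l) => l.rel === 'hub');
  if (!hub) return null; // no realtime capability advertised
  if (hub.type === 'text/event-stream') {
    return { transport: 'sse', href: hub.href };
  }
  return { transport: 'websocket', href: hub.href };
}
```

A caller would then open an `EventSource` or a `WebSocket` depending on the returned `transport` value.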
|
Yes, the pygeoapi documentation is pretty vague on that point. After looking at the code changes for the PR where pub/sub was introduced, it looks to me like pygeoapi does not actually implement a broker of its own. It looks like it publishes changes to a broker, and in the AsyncAPI you can point out where that broker is. The HTTP pub/sub configuration is implemented as a POST to the configured URL, and the MQTT configuration is implemented as a publish call to a broker. So my conclusion is that it would still be necessary to implement some sort of broker that accepts subscriptions from Origo in order to use pygeoapi. Implementing such a broker is pretty trivial regardless of transport. If pygeoapi has a change discovery mechanism that POSTs events to a configured broker, that would be a great start for a generic implementation, as writing a broker that accepts POSTed events and turns them into WebSocket/SSE events to send to subscribers (i.e. Origo) is a walk in the park. For more complex scenarios a custom backend would still be useful. The following use cases would benefit from implementing both the OGC API and the broker in the same server side, or at least having the possibility to share sessions.
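The fan-out core of such a broker could be as small as the sketch below. The transport wiring is deliberately left out: an HTTP handler receiving pygeoapi's POSTed events would call `publish()`, and an SSE or WebSocket connection handler would call `subscribe()`. Names and structure are illustrative, not any existing API.

```javascript
// Minimal in-memory fan-out broker: accepts published events per layer and
// forwards them to every subscribed callback (each callback standing in for
// one connected SSE/WebSocket client).
class Broker {
  constructor() {
    this.subscribers = new Map(); // layer name -> Set of callbacks
  }
  subscribe(layer, callback) {
    if (!this.subscribers.has(layer)) {
      this.subscribers.set(layer, new Set());
    }
    this.subscribers.get(layer).add(callback);
    // Return an unsubscribe function for when the client disconnects.
    return () => this.subscribers.get(layer).delete(callback);
  }
  publish(layer, event) {
    let delivered = 0;
    for (const cb of this.subscribers.get(layer) || []) {
      cb(event);
      delivered += 1;
    }
    return delivered;
  }
}
```

Persistence, replay of missed events and authentication are exactly the parts that would push this from "walk in the park" towards a real backend.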
|
|
GitHub automatically closed this PR as the new changes nullified all the previous code. Opening again with a brand new approach, replacing SSE with WebSockets.

Completely new implementation based on the OGC Features API pub/sub pattern and WebSockets

This implementation adds a new layer type that implements a very basic OGC Features API source. It only supports getting an items collection, without any filter, bbox or CRS support. The reason for this is that it only serves as a bootstrap for a real time stream; however, it can easily be augmented to support more of the standard. If the response contains a hub link to a stream, the layer connects to the stream and receives updates. To be future proof, the layer must be configured with real time support, as the default is NOT to follow links. It only supports WebSockets as hub endpoint. It does not support retrieving AsyncAPI documents to discover links to stream hubs. Since data can be of different kinds with respect to current/initial state, and since missed events may be crucial, it is possible to configure the layer for different scenarios reflecting how the server handles initial states and reconnects.
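The bootstrap-then-follow flow described above can be sketched roughly as follows. The function name and options object are invented for illustration; only the rel="hub" link convention and the opt-in realtime flag come from the description.

```javascript
// Rough sketch of the layer's bootstrap under this design: fetch the items
// collection once, then follow a rel="hub" link with a WebSocket for updates.
// Following links is opt-in, matching the default of NOT following them.
async function bootstrapRealtimeLayer(itemsUrl, { realtime = false } = {}) {
  const res = await fetch(itemsUrl);
  // Assumed to be a GeoJSON FeatureCollection carrying a "links" array,
  // as in OGC API - Features responses.
  const collection = await res.json();
  const hub = (collection.links || []).find((l) => l.rel === 'hub');
  let socket = null;
  if (realtime && hub) {
    socket = new WebSocket(hub.href); // hub endpoint assumed to speak WebSocket
  }
  return { features: collection.features, socket };
}
```

Reconnect handling (refetching the collection versus just reopening the socket) would hang off the socket's close event, driven by the `realtimeReconnect` setting shown in the configuration below.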
The layer implementation supports
Considered standards
Testing

As there are no known servers that implement a WebSocket hub it is pretty tough to test. I have updated the reference server at https://github.com/ornskoldsvikskommun/origo-realtime-reference-api to work with this layer type, but since I have developed both, the same errors could exist on both sides.

Configuration

First define a source:

```json
"source": {
  "realtime-ref": {
    "url": "http://localhost:3004/featuresapi/",
    "projection": "EPSG:3006"
  }
},
```

Then a layer:

```jsonc
{
  "name": "realtime linjelager",
  "id": "linjelager",
  "title": "RealTime Linje",
  "group": "root",
  "source": "realtime-ref",
  "type": "FEATURESAPI",
  "realtimeReconnect": "full", // one of 'none', 'full', 'stream'; default 'full'
  "realtime": true, // Enable real time updates
  "realtimeDisconnectOnHide": true, // Disconnect when layer is hidden and reconnect when visible. Default true
  "editable": false,
  "attributes": [
    {
      "title": "Fritext",
      "name": "fritext",
      "type": "text"
    }
  ],
  "geometryName": "the_geom",
  "geometryType": "LineString",
  "visible": true
},
```

The
|
Resolves #2262.
Creating as draft as we are not ready with the backend, which may seriously change the requirements on the layer type. But it is working for you all to enjoy during the Christmas holiday.
In order to test it you will need a backend sending SSE events. The event should be a named event: `update` for updates and inserts, which are treated equally, like PUT. The payload should be exactly one feature encoded as GeoJSON. For deletions a `delete` event should be sent with an object with one property called `id`, containing the id (as previously sent as id in the GeoJSON) to delete. To connect to the server, the server should accept a query argument named `layer`.

There is no fetching of an initial state, so the layer will be empty until events arrive. But the server may send a series of events on connect; nothing special about them, but it will emulate an initial state. I have also considered adding support for fetching an initial state through a WFS endpoint, but since the rest of the code does not use WFS, and the source may not be backed by a database, I thought it is easier if the real time server provides it as ordinary events. For large datasets it could possibly become slow, but by sending the events individually instead of one giant combined initial event it will not reach buffer limits.
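Client-side handling of this event protocol boils down to an upsert/remove pair. In the sketch below a `Map` keyed by feature id stands in for whatever feature store the layer really uses; the two event names and their payload shapes are the ones described above.

```javascript
// Apply one realtime event to a feature store (a Map keyed by feature id).
// "update" carries one GeoJSON feature and is an upsert (PUT semantics);
// "delete" carries an object like { "id": ... }.
function applyRealtimeEvent(store, eventName, data) {
  if (eventName === 'update') {
    store.set(data.id, data); // insert and update are treated alike
  } else if (eventName === 'delete') {
    store.delete(data.id);
  }
  return store;
}

// Wiring it to an EventSource, with the layer name as a query argument:
// const es = new EventSource(`${url}?layer=${layerName}`);
// es.addEventListener('update', (e) => applyRealtimeEvent(store, 'update', JSON.parse(e.data)));
// es.addEventListener('delete', (e) => applyRealtimeEvent(store, 'delete', JSON.parse(e.data)));
```

Because named SSE events are used, a plain `onmessage` handler would never fire; the listeners must be registered per event name as in the commented wiring.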
I have implemented a reference implementation of a server to demonstrate the functionality without having to create a backend: https://github.com/ornskoldsvikskommun/origo-realtime-reference-api It is pretty simple and uses a PostGis database with triggers as source.
The implementation uses Server-Sent Events (SSE) as specified by the `EventSource` specification. It has a serious limitation when using HTTP/1.1, as each layer will hog one of the six available connections per remote host. Normally this is not a problem, as most server sides can be written to support HTTP/2, either natively or by putting an HTTP/2 capable reverse proxy (HAProxy, Nginx, IIS or whatever) in front. The only situation where it can't be avoided is when using IIS with Windows Integrated Authentication; it will work, but the connection will be downgraded to HTTP/1.1. Some other limitations are that it is not possible to send custom headers and that the connection can only be initiated with a GET request.

Features
Configuration
Future Improvements
Here are some ideas. Some may be added to this PR; some will have to wait until someone needs them.