forked from activitypods/mastopod
Open
Description
This list currently focuses on the dev-pub-timeline branch. The main goal there is to implement a public timeline with Redpanda/Opensearch and use it for searching posts, serving hashtag pages, and serving timelines. The current priority is the local timeline. Once that's set up, tags come next, then search. After that comes setting things up for production, then testing and developing for federation (ideally with Mastodon compatibility).
TODO:
- Write documentation for everything
- general use
- configuration
- what does what and why, and what resources are available to look into modifying
- TL;DR: make things as easy as possible for current and future devs of this project (and any projects that come of it), so development goes smoothly without anyone needing to reverse engineer the code to figure out what's going on or how to do a specific thing
- asynchronous search for opensearch
- set up authorization in opensearch
- redpanda sqlite buffering
- post searching
- should be as simple as setting up an action in the `public-posts` service to search with Opensearch
- follower fetching (redpanda thing for nodal networks)
- this would be good for smoothing out backend federation
- https://docs.redpanda.com/current/develop/consume-data/follower-fetching/
- subscribing to hashtags
- set up a Redpanda service such that whenever data is uploaded to the topic, it notifies a Mastopod service with whatever would normally be sent to an Inbox when a post is posted, plus the tag (if any), so that the Mastopod service can distribute it to the PODs subscribed to that tag
- whenever a post is made, check for tags on the Redpanda end.
- set up a service that Redpanda can notify when a post is made to a tag
- when that notification is received, distribute to all inboxes subscribed to that tag on the local POD provider
- doing it this way allows all POD providers using a given Redpanda instance to be notified at the same time, without needing to coordinate directly between POD providers
- using this to post to pod inboxes:
```
const { body: resource } = await ctx.call('pod-resources.get', { resourceUri, actorUri: 'http://localhost:3000/alice' });
```
- Properly configure Redpanda database
- json_schema to ensure proper format of json for data integrity
- opensearch setup redpanda
- here's some resources for an elasticsearch setup. might be useful for further development with opensearch
- https://www.redpanda.com/blog/build-search-index-flink-elasticsearch
- going to do it without Flink for now; I don't think we really need it, and it shouldn't be too hard to add in later
- set things up so that posts go to Redpanda and Redpanda sends them to Opensearch, rather than saving them in a Redpanda topic, so as to prevent unnecessary redundancy
- get the endpoints working
- set up POD provider-level endpoints via a service to do things like serve tag pages
- tested
- working
- as a graph database?
- Neo4j
- properly set up the Redpanda DBMS endpoints
- two of them: one for posting, one for fetching posts
- posting
- local timeline (fetching posts)
- configure Redpanda Console
- rp-connect.yml
- integrate `sqlite` buffer
- fetching using only Redpanda (no longer needed with use of Opensearch)
- fetch the first N bytes from offset X
- if num_posts < num_posts_per_page, fetch the next N bytes and take the first of those until num of posts in page is reached, in which case get the offset from the last of those and use it for getting the next page
- put them into the lists and only display the first 10 of them. then, if there are any leftover, use them to start the next list, and get the offset for the next REST call from the last one.
- wait, no, this won't work well, because we need to do hashtags and doing so with this would be horribly inefficient
- use `dynamic` thingy for fetching hashtags
- opensearch
- basic opensearch setup
- integrate with `public-posts` service
- tested and working
- set things up to be easily scalable for interconnecting nodes in a server network
- iirc opensearch, like Redpanda, should be able to do something like this. If not, set it up through Redpanda
- federated backend node interconnectivity
- Redpanda
- set things up to automate as much of the process of setting up and configuring and connecting a new node to a Redpanda network as possible
- Opensearch
- set things up to automate as much of the process of setting up and configuring and connecting a new node to an Opensearch network as possible
- ensure scalability and simplicity
- consider setting things up so that you can connect nodes with an OTP or something of the sort, alongside a password or other method(s) of authentication, when initializing a new node in the network
- ensure redundancy in the network
- there's probably some stuff built into Redpanda and/or Opensearch that could well be leveraged for this use case
- Redpanda
- get a functional test post
- put together a basic page for the timeline
- implement proper access control via WAC
- do your research, do it right, security is very important, a bit of work now can prevent a lot of work later
- lower priority, todo after everything is functional:
- implement batching throughout the Redpanda pipeline
- figure out a nice and secure way to protect backend passwords in files, either by managing read permissions and hashing or otherwise
- set up HTTPS for everything
- look into streaming for optimization
- look into batching/chunking for optimization
- set up buffer (sqlite probably) for preventing loss of posts/data
- OAUTH2
- cluster balancing, follow the leader, enable rack awareness, follower fetching (efficient consumption across distributed Access Zones)
- set up a limit on how many posts per minute can be made per user
- set it up so that only authorized users are allowed to be put into the database
- set up thing to prevent tracking empty posts
- set up thing in Redpanda Connect pipeline to handle collisions when two users post at the exact same millisecond (currently that is not handled)
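The last few items above (a per-user post limit, skipping empty posts, and same-millisecond collisions) could be sketched as plain checks that run before a post is handed to Redpanda. This is only a sketch under assumptions: `createPostGuard`, `maxPerMinute`, `actorUri`, `postUri`, and `content` are hypothetical names, the limiter is in-memory (so it only covers a single process), and the real checks may fit better inside the Redpanda Connect pipeline itself.

```javascript
// Hypothetical pre-publish checks; none of these names come from the Mastopod codebase.
function createPostGuard({ maxPerMinute = 10, windowMs = 60000 } = {}) {
  // actorUri -> timestamps (ms) of that actor's accepted posts inside the window
  const recent = new Map();

  return function check(post, now = Date.now()) {
    // Skip empty posts (missing or whitespace-only content).
    if (typeof post.content !== 'string' || post.content.trim() === '') {
      return { ok: false, reason: 'empty' };
    }

    // Sliding-window rate limit per actor.
    const cutoff = now - windowMs;
    const times = (recent.get(post.actorUri) || []).filter((t) => t > cutoff);
    if (times.length >= maxPerMinute) {
      recent.set(post.actorUri, times);
      return { ok: false, reason: 'rate-limited' };
    }
    times.push(now);
    recent.set(post.actorUri, times);

    // Key on timestamp plus post URI, so two posts made in the same
    // millisecond still get distinct keys.
    return { ok: true, key: `${now}:${post.postUri}` };
  };
}
```

A caller would create one guard per process with `createPostGuard({ maxPerMinute: 5 })` and only forward posts for which `check(post).ok` is true, using the returned `key` as the record key.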
Important info:
- Prior to production, make sure to change the password (and username) in `docker-compose.yml` (and other such things) for each of their entries throughout the document
Notes:
- What should be stored is the URIs which may be used to get the desired posts
- the URI of the user and the URI of the post in question
- perhaps what kind of ActivityStreams object as well, so as to handle more than just Activity objects or Notes (e.g. allowing videos or other such things)
- speaking of videos, consider setting something up that allows users to post a link to PeerTube or something of the like, and the link will automatically be rendered as the video itself
- check if there's an API for this. If not, put together a simple one that just downloads the video and plays it using a React module or something of the like
- Doesn't need to be up to any spec or anything, since it's just gonna be a list of URIs, so just do it however you want I guess
- Handling paging
- Record the time upon query to the database (time of request)
- Create the first page using the first N entries prior to the time of request
- Create the second using the next N entries prior to time of request
- and so on and so forth
- this is how I'm planning on doing paging with Opensearch, via the use of the `search_after` parameter
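A minimal sketch of the paging scheme above as an Opensearch request body. The field names (`published`, `postUri`, `actorUri`, `type`) and the helpers `buildPageQuery` and `nextCursor` are assumptions for illustration, not the project's actual mapping; `size`, `sort`, `range`, and `search_after` are standard Opensearch query DSL. It also assumes `postUri` is indexed as a keyword field so it can be used as a sort tiebreaker.

```javascript
// Build the search body for one timeline page.
// requestTime: ISO timestamp recorded when the timeline was first requested.
// cursor: the `sort` array of the last hit on the previous page (omit for page 1).
function buildPageQuery({ requestTime, pageSize = 10, cursor }) {
  const body = {
    size: pageSize,
    // Only the URIs (and object type) are needed to fetch the posts themselves.
    _source: ['postUri', 'actorUri', 'type'],
    // Freeze the result set at the time of request so new posts do not shift pages.
    query: { range: { published: { lte: requestTime } } },
    // Newest first; postUri breaks ties between posts with the same timestamp.
    sort: [{ published: 'desc' }, { postUri: 'asc' }],
  };
  if (cursor) body.search_after = cursor;
  return body;
}

// Take the cursor for the next page from the last hit of a search response.
function nextCursor(response) {
  const hits = response.hits.hits;
  return hits.length > 0 ? hits[hits.length - 1].sort : null;
}
```

The body would be sent to the index's `_search` endpoint. Page N+1 reuses the same `requestTime` and passes `nextCursor` of page N's response as `cursor`; a `null` cursor means there are no more pages.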