diff --git a/.trunk/trunk.yaml b/.trunk/trunk.yaml
index a6973a49..db0d5e9b 100644
--- a/.trunk/trunk.yaml
+++ b/.trunk/trunk.yaml
@@ -2,7 +2,7 @@
version: 0.1
cli:
- version: 1.22.9
+ version: 1.22.10
plugins:
sources:
@@ -18,20 +18,20 @@ runtimes:
lint:
enabled:
- - renovate@39.149.0
+ - renovate@39.164.1
- golangci-lint@1.63.4
- - vale@3.9.4
+ - vale@3.9.5
- actionlint@1.7.7
- - checkov@3.2.360
+ - checkov@3.2.369
- git-diff-check
- markdownlint@0.44.0
- oxipng@9.1.3
- - prettier@3.4.2:
+ - prettier@3.5.0:
packages:
- "@mintlify/prettier-config@1.0.6"
- svgo@3.3.2
- - trivy@0.59.0
- - trufflehog@3.88.4
+ - trivy@0.59.1
+ - trufflehog@3.88.5
- yamllint@1.35.1
ignore:
- linters: [ALL]
diff --git a/README.md b/README.md
index f23310ef..03a35d8a 100644
--- a/README.md
+++ b/README.md
@@ -10,12 +10,12 @@ documentation for the open source projects that Hypermode leads:
- [Modus](https://github.com/hypermodeinc/modus) - serverless framework for
building functions and APIs, powered by WebAssembly
-- [Dgraph](https://github.com/dgraph-io/dgraph) - distributed, transactional
+- [Dgraph](https://github.com/hypermodeinc/dgraph) - distributed, transactional
graph database for real-time use cases
-- [Badger](https://github.com/dgraph-io/badger) - embeddable key-value store in
- Go
-- [Ristretto](https://github.com/dgraph-io/ristretto) - embeddable memory-bound
- cache in Go
+- [Badger](https://github.com/hypermodeinc/badger) - embeddable key-value store
+ in Go
+- [Ristretto](https://github.com/hypermodeinc/ristretto) - embeddable
+ memory-bound cache in Go
## Found an issue?
diff --git a/badger/design.mdx b/badger/design.mdx
new file mode 100644
index 00000000..42264a2d
--- /dev/null
+++ b/badger/design.mdx
@@ -0,0 +1,62 @@
+---
+title: Design
+description: Architected for fast key-value storage in Go
+"og:title": "Design - Badger"
+---
+
+We wrote Badger with these design goals in mind:
+
+- Write a key-value database in pure Go
+- Use the latest research to build the fastest KV database for data sets
+  spanning terabytes
+- Optimize for modern storage devices
+
+Badger’s design is based on a paper titled
+[WiscKey: Separating Keys from Values in SSD-conscious Storage](https://www.usenix.org/system/files/conference/fast16/fast16-papers-lu.pdf).
+
+## References
+
+The following blog posts are a great starting point for learning more about
+Badger and the underlying design principles:
+
+- [Introducing Badger: A fast key-value store written natively in Go](https://dgraph.io/blog/post/badger/)
+- [Make Badger crash resilient with ALICE](https://dgraph.io/blog/post/alice/)
+- [Badger vs LMDB vs BoltDB: Benchmarking key-value databases in Go](https://dgraph.io/blog/post/badger-lmdb-boltdb/)
+- [Concurrent ACID Transactions in Badger](https://dgraph.io/blog/post/badger-txn/)
+
+## Comparisons
+
+| Feature | Badger | RocksDB | BoltDB |
+| ----------------------------- | -------------------------------------- | ---------------------------- | ------- |
+| Design | LSM tree with value log | LSM tree only | B+ tree |
+| High Read throughput | Yes | No | Yes |
+| High Write throughput | Yes | Yes | No |
+| Designed for SSDs             | Yes (with latest research<sup>1</sup>) | Not specifically<sup>2</sup> | No      |
+| Embeddable | Yes | Yes | Yes |
+| Sorted KV access | Yes | Yes | Yes |
+| Pure Go (no Cgo) | Yes | No | Yes |
+| Transactions | Yes | Yes | Yes |
+| ACID-compliant                | Yes, concurrent with SSI<sup>3</sup>   | No                           | Yes     |
+| Snapshots | Yes | Yes | Yes |
+| TTL support | Yes | Yes | No |
+| 3D access (key-value-version) | Yes<sup>4</sup>                        | No                           | No      |
+
+<sup>1</sup> The WiscKey paper (on which Badger is based) saw big wins with
+separating values from keys, significantly reducing the write amplification
+compared to a typical LSM tree.
+
+<sup>2</sup> RocksDB is an SSD-optimized version of LevelDB, which was designed
+specifically for rotating disks. As such, RocksDB's design isn't aimed at SSDs.
+
+<sup>3</sup> SSI: Serializable Snapshot Isolation. For more details, see the
+blog post [Concurrent ACID Transactions in
+Badger](https://dgraph.io/blog/post/badger-txn/).
+
+<sup>4</sup> Badger provides direct access to value versions via its Iterator
+API. Users can also specify how many versions to keep per key via Options.
+
+## Benchmarks
+
+We've run comprehensive benchmarks against RocksDB, BoltDB, and LMDB. The
+benchmarking code and detailed logs are in the
+[badger-bench](https://github.com/dgraph-io/badger-bench) repo.
diff --git a/badger/overview.mdx b/badger/overview.mdx
new file mode 100644
index 00000000..c6343f92
--- /dev/null
+++ b/badger/overview.mdx
@@ -0,0 +1,19 @@
+---
+title: Overview
+description: Welcome to the Badger docs!
+mode: "wide"
+"og:title": "Overview - Badger"
+---
+
+## What is Badger?
+
+BadgerDB is an embeddable, persistent, and fast key-value (KV) database written
+in pure Go. It's the underlying database for [Dgraph](https://dgraph.io), a
+fast, distributed graph database. It's meant to be an efficient alternative to
+non-Go-based key-value stores like RocksDB.
+
+## Changelog
+
+We keep the
+[repo Changelog](https://github.com/hypermodeinc/badger/blob/main/CHANGELOG.md)
+up to date with each release.
diff --git a/badger/quickstart.mdx b/badger/quickstart.mdx
new file mode 100644
index 00000000..f2b7622d
--- /dev/null
+++ b/badger/quickstart.mdx
@@ -0,0 +1,729 @@
+---
+title: Quickstart
+description: Everything you need to get started with Badger
+"og:title": "Quickstart - Badger"
+---
+
+## Prerequisites
+
+- [Go](https://go.dev/doc/install) - v1.23 or higher
+- Text editor - we recommend [VS Code](https://code.visualstudio.com/)
+- Terminal - access Badger through a command-line interface (CLI)
+
+## Installing
+
+To start using Badger, run the following command to retrieve the library.
+
+```sh
+go get github.com/dgraph-io/badger/v4
+```
+
+Then, install the Badger command line utility into your `$GOBIN` path.
+
+```sh
+go install github.com/dgraph-io/badger/v4/badger@latest
+```
+
+## Opening a database
+
+The top-level object in Badger is a `DB`. It represents multiple files on disk
+in specific directories, which contain the data for a single database.
+
+To open your database, use the `badger.Open()` function, with the appropriate
+options. The `Dir` and `ValueDir` options are mandatory and you must specify
+them in your client. To simplify, you can set both options to the same value.
+
+<Note>
+  Badger obtains a lock on the directories. Multiple processes can't open the
+  same database at the same time.
+</Note>
+
+```go
+package main
+
+import (
+  "log"
+
+  badger "github.com/dgraph-io/badger/v4"
+)
+
+func main() {
+  // Open the Badger database located in the /tmp/badger directory.
+  // It will be created if it doesn't exist.
+  db, err := badger.Open(badger.DefaultOptions("/tmp/badger"))
+  if err != nil {
+    log.Fatal(err)
+  }
+
+  defer db.Close()
+
+  // your code here
+}
+```
+
+### In-memory/diskless mode
+
+By default, Badger ensures all data persists to disk. It also supports a pure
+in-memory mode. When Badger is running in this mode, all data remains in memory
+only. Reads and writes are much faster, but Badger loses all stored data in the
+case of a crash or close. To open Badger in in-memory mode, set the `InMemory`
+option.
+
+```go
+opt := badger.DefaultOptions("").WithInMemory(true)
+```
+
+### Encryption mode
+
+If you enable encryption in Badger, you also need to set the index cache size.
+
+<Note>
+  The cache improves performance. Otherwise, reads can be very slow with
+  encryption enabled.
+</Note>
+
+For example, to set a `100 MB` cache:
+
+```go
+opts.IndexCacheSize = 100 << 20 // 100 MB or some other size based on the amount of data
+```
+
+## Transactions
+
+### Read-only transactions
+
+To start a read-only transaction, you can use the `DB.View()` method:
+
+```go
+err := db.View(func(txn *badger.Txn) error {
+  // your code here
+
+  return nil
+})
+```
+
+You can't perform any writes or deletes within this transaction. Badger ensures
+that you get a consistent view of the database within this closure. Any writes
+that happen elsewhere after the transaction has started aren't seen by calls
+made within the closure.
+
+### Read-write transactions
+
+To start a read-write transaction, you can use the `DB.Update()` method:
+
+```go
+err := db.Update(func(txn *badger.Txn) error {
+  // your code here
+  return nil
+})
+```
+
+Badger allows all database operations inside a read-write transaction.
+
+Always check the returned error value. If you return an error within your
+closure, it's passed through.
+
+An `ErrConflict` error is reported in case of a conflict. Depending on the state
+of your app, you have the option to retry the operation if you receive this
+error.
+
+An `ErrTxnTooBig` error is reported in case the number of pending writes/deletes
+in the transaction exceeds a certain limit. In that case, it's best to commit
+the transaction and start a new transaction immediately. Here's an example
+(error checks are omitted in some places for simplicity):
+
+```go
+updates := make(map[string]string) // assume this is populated elsewhere
+txn := db.NewTransaction(true)
+for k, v := range updates {
+  if err := txn.Set([]byte(k), []byte(v)); err == badger.ErrTxnTooBig {
+    _ = txn.Commit()
+    txn = db.NewTransaction(true)
+    _ = txn.Set([]byte(k), []byte(v))
+  }
+}
+_ = txn.Commit()
+```
+
+### Managing transactions manually
+
+The `DB.View()` and `DB.Update()` methods are wrappers around the
+`DB.NewTransaction()` and `Txn.Commit()` methods (or `Txn.Discard()` in case of
+read-only transactions). These helper methods start the transaction, execute a
+function, and then safely discard your transaction if an error is returned. This
+is the recommended way to use Badger transactions.
+
+However, sometimes you may want to manually create and commit your transactions.
+You can use the `DB.NewTransaction()` function directly, which takes in a
+boolean argument to specify whether a read-write transaction is required. For
+read-write transactions, it's necessary to call `Txn.Commit()` to ensure the
+transaction is committed. For read-only transactions, calling `Txn.Discard()` is
+sufficient. `Txn.Commit()` also calls `Txn.Discard()` internally to clean up
+the transaction, so just calling `Txn.Commit()` is sufficient for read-write
+transactions. However, if your code doesn't call `Txn.Commit()` for some reason
+(for example, it returns prematurely with an error), then make sure you call
+`Txn.Discard()` in a `defer` block. Refer to the code below.
+
+```go
+// Start a writable transaction.
+txn := db.NewTransaction(true)
+defer txn.Discard()
+
+// Use the transaction...
+err := txn.Set([]byte("answer"), []byte("42"))
+if err != nil {
+ return err
+}
+
+// Commit the transaction and check for error.
+if err := txn.Commit(); err != nil {
+ return err
+}
+```
+
+The first argument to `DB.NewTransaction()` is a boolean stating if the
+transaction should be writable.
+
+Badger allows an optional callback to the `Txn.Commit()` method. Normally, the
+callback can be set to `nil`, and the method will return after all the writes
+have succeeded. However, if this callback is provided, the `Txn.Commit()` method
+returns as soon as it has checked for any conflicts. The actual writing to the
+disk happens asynchronously, and the callback is invoked once the writing has
+finished, or an error has occurred. This can improve the throughput of the app
+in some cases. But it also means that a transaction isn't durable until the
+callback has been invoked with a `nil` error value.
+
+## Using key/value pairs
+
+To save a key/value pair, use the `Txn.Set()` method:
+
+```go
+err := db.Update(func(txn *badger.Txn) error {
+  err := txn.Set([]byte("answer"), []byte("42"))
+  return err
+})
+```
+
+A key/value pair can also be saved by first creating an `Entry`, then setting
+this `Entry` using `Txn.SetEntry()`. `Entry` also exposes methods to set
+properties on it.
+
+```go
+err := db.Update(func(txn *badger.Txn) error {
+  e := badger.NewEntry([]byte("answer"), []byte("42"))
+  err := txn.SetEntry(e)
+  return err
+})
+```
+
+This sets the value of the `"answer"` key to `"42"`. To retrieve this value, we
+can use the `Txn.Get()` method:
+
+```go
+err := db.View(func(txn *badger.Txn) error {
+  item, err := txn.Get([]byte("answer"))
+  handle(err)
+
+  var valNot, valCopy []byte
+  err = item.Value(func(val []byte) error {
+    // This func with val would only be called if item.Value encounters no error.
+
+    // Accessing val here is valid.
+    fmt.Printf("The answer is: %s\n", val)
+
+    // Copying or parsing val is valid.
+    valCopy = append([]byte{}, val...)
+
+    // Assigning val slice to another variable is NOT OK.
+    valNot = val // Do not do this.
+    return nil
+  })
+  handle(err)
+
+  // DO NOT access val here. It is the most common cause of bugs.
+  fmt.Printf("NEVER do this. %s\n", valNot)
+
+  // You must copy it to use it outside item.Value(...).
+  fmt.Printf("The answer is: %s\n", valCopy)
+
+  // Alternatively, you could also use item.ValueCopy().
+  valCopy, err = item.ValueCopy(nil)
+  handle(err)
+  fmt.Printf("The answer is: %s\n", valCopy)
+
+  return nil
+})
+```
+
+`Txn.Get()` returns `ErrKeyNotFound` if the value isn't found.
+
+Please note that values returned from `Get()` are only valid while the
+transaction is open. If you need to use a value outside of the transaction,
+then you must use `copy()` (or `Item.ValueCopy()`) to copy it to another byte
+slice.
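+
+To see why the copy matters, here's a plain Go sketch (no Badger involved) of
+the difference between aliasing a byte slice and copying it:
+
+```go
+package main
+
+import "fmt"
+
+func main() {
+  buf := []byte("42") // stands in for Badger's internal value buffer
+
+  aliased := buf                        // shares backing memory with buf
+  copied := append([]byte(nil), buf...) // an independent copy
+
+  // Simulate Badger reusing the buffer after the transaction closes.
+  buf[0] = 'X'
+
+  fmt.Println(string(aliased)) // X2 -- the alias sees the mutation
+  fmt.Println(string(copied))  // 42 -- the copy is unaffected
+}
+```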
+
+Use the `Txn.Delete()` method to delete a key.
+
+## Monotonically increasing integers
+
+To get unique monotonically increasing integers with strong durability, you can
+use the `DB.GetSequence` method. This method returns a `Sequence` object, which
+is thread-safe and can be used concurrently via various goroutines.
+
+Badger leases a range of integers to hand out from memory, with the bandwidth
+provided to `DB.GetSequence`. The frequency of disk writes is determined by this
+lease bandwidth and the frequency of `Next` invocations. Setting the bandwidth
+too low causes more disk writes; setting it too high results in wasted integers
+if Badger is closed or crashes. To avoid wasted integers, call `Release` before
+closing Badger.
+
+```go
+seq, err := db.GetSequence(key, 1000)
+handle(err)
+defer seq.Release()
+for {
+  num, err := seq.Next()
+  handle(err)
+  // use num here
+}
+```
+
+## Merge operations
+
+Badger provides support for ordered merge operations. You can define a func of
+type `MergeFunc` which takes in an existing value, and a value to be _merged_
+with it. It returns a new value which is the result of the _merge_ operation.
+All values are specified in byte arrays. For example, here is a merge function
+(`add`) which appends a `[]byte` value to an existing `[]byte` value.
+
+```go
+// Merge function to append one byte slice to another
+func add(originalValue, newValue []byte) []byte {
+  return append(originalValue, newValue...)
+}
+```
+
+This function can then be passed to the `DB.GetMergeOperator()` method, along
+with a key, and a duration value. The duration specifies how often the merge
+function is run on values that have been added using the `MergeOperator.Add()`
+method.
+
+The `MergeOperator.Get()` method can be used to retrieve the cumulative value
+of the key associated with the merge operation.
+
+```go
+key := []byte("merge")
+
+m := db.GetMergeOperator(key, add, 200*time.Millisecond)
+defer m.Stop()
+
+m.Add([]byte("A"))
+m.Add([]byte("B"))
+m.Add([]byte("C"))
+
+res, _ := m.Get() // res should have value ABC encoded
+```
+
+Here's an example of a merge operator that implements a counter by adding
+`uint64` values:
+
+```go
+func uint64ToBytes(i uint64) []byte {
+  var buf [8]byte
+  binary.BigEndian.PutUint64(buf[:], i)
+  return buf[:]
+}
+
+func bytesToUint64(b []byte) uint64 {
+  return binary.BigEndian.Uint64(b)
+}
+
+// Merge function to add two uint64 numbers
+func add(existing, new []byte) []byte {
+  return uint64ToBytes(bytesToUint64(existing) + bytesToUint64(new))
+}
+```
+
+It can be used as follows:
+
+```go
+key := []byte("merge")
+
+m := db.GetMergeOperator(key, add, 200*time.Millisecond)
+defer m.Stop()
+
+m.Add(uint64ToBytes(1))
+m.Add(uint64ToBytes(2))
+m.Add(uint64ToBytes(3))
+
+res, _ := m.Get() // res should have value 6 encoded
+```
+
+## Setting time to live (TTL) and user metadata on keys
+
+Badger allows setting an optional Time to Live (TTL) value on keys. Once the TTL
+has elapsed, the key is no longer retrievable and is eligible for garbage
+collection. A TTL can be set as a `time.Duration` value using the
+`Entry.WithTTL()` and `Txn.SetEntry()` API methods.
+
+```go
+err := db.Update(func(txn *badger.Txn) error {
+  e := badger.NewEntry([]byte("answer"), []byte("42")).WithTTL(time.Hour)
+  err := txn.SetEntry(e)
+  return err
+})
+```
+
+An optional user metadata value can be set on each key. A user metadata value is
+represented by a single byte. It can be used to set certain bits along with the
+key to aid in interpreting or decoding the key-value pair. User metadata can be
+set using `Entry.WithMeta()` and `Txn.SetEntry()` API methods.
+
+```go
+err := db.Update(func(txn *badger.Txn) error {
+  e := badger.NewEntry([]byte("answer"), []byte("42")).WithMeta(byte(1))
+  err := txn.SetEntry(e)
+  return err
+})
+```
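+
+Because the metadata is a single byte, a common pattern is to treat it as a
+small bit field. A minimal sketch (the flag names here are hypothetical, not
+part of the Badger API):
+
+```go
+package main
+
+import "fmt"
+
+const (
+  flagCompressed byte = 1 << 0 // hypothetical flag: value is compressed
+  flagEncrypted  byte = 1 << 1 // hypothetical flag: value is encrypted
+)
+
+func main() {
+  meta := flagCompressed | flagEncrypted // pass via Entry.WithMeta(meta)
+
+  fmt.Println(meta&flagCompressed != 0) // true
+  fmt.Println(meta&flagEncrypted != 0)  // true
+}
+```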
+
+The `Entry` APIs can be used to set both the user metadata and TTL for the same
+key. This `Entry` can then be set using `Txn.SetEntry()`.
+
+```go
+err := db.Update(func(txn *badger.Txn) error {
+  e := badger.NewEntry([]byte("answer"), []byte("42")).WithMeta(byte(1)).WithTTL(time.Hour)
+  err := txn.SetEntry(e)
+  return err
+})
+```
+
+## Iterating over keys
+
+To iterate over keys, we can use an `Iterator`, which can be obtained using the
+`Txn.NewIterator()` method. Iteration happens in byte-wise lexicographical
+sorting order.
+
+```go
+err := db.View(func(txn *badger.Txn) error {
+  opts := badger.DefaultIteratorOptions
+  opts.PrefetchSize = 10
+  it := txn.NewIterator(opts)
+  defer it.Close()
+  for it.Rewind(); it.Valid(); it.Next() {
+    item := it.Item()
+    k := item.Key()
+    err := item.Value(func(v []byte) error {
+      fmt.Printf("key=%s, value=%s\n", k, v)
+      return nil
+    })
+    if err != nil {
+      return err
+    }
+  }
+  return nil
+})
+```
+
+The iterator allows you to move to a specific point in the list of keys and move
+forward or backward through the keys one at a time.
+
+By default, Badger prefetches the values of the next 100 items. You can adjust
+that with the `IteratorOptions.PrefetchSize` field. However, setting it to a
+value higher than `GOMAXPROCS` (which we recommend setting to 128 or higher)
+shouldn't give any additional benefits. You can also turn off the fetching of
+values altogether. See the section below on key-only iteration.
+
+### Prefix scans
+
+To iterate over a key prefix, you can combine `Seek()` and `ValidForPrefix()`:
+
+```go
+db.View(func(txn *badger.Txn) error {
+  it := txn.NewIterator(badger.DefaultIteratorOptions)
+  defer it.Close()
+  prefix := []byte("1234")
+  for it.Seek(prefix); it.ValidForPrefix(prefix); it.Next() {
+    item := it.Item()
+    k := item.Key()
+    err := item.Value(func(v []byte) error {
+      fmt.Printf("key=%s, value=%s\n", k, v)
+      return nil
+    })
+    if err != nil {
+      return err
+    }
+  }
+  return nil
+})
+```
+
+### Possible pagination implementation using Prefix scans
+
+Considering that iteration happens in **byte-wise lexicographical sorting**
+order, it's possible to create a sorting-sensitive key. For example, a simple
+blog post key might look like `feed:userUuid:timestamp:postUuid`. Here, the
+`timestamp` part of the key is treated as an attribute, and items are stored in
+the corresponding order:
+
+| Order ASC | Key |
+| :-------: | :------------------------------------------------------------ |
+| 1 | feed:tQpnEDVRoCxTFQDvyQEzdo:1733127889:tQpnEDVRoCxTFQDvyQEzdo |
+| 2 | feed:tQpnEDVRoCxTFQDvyQEzdo:1733127533:1Mryrou1xoekEaxzrFiHwL |
+| 3 | feed:tQpnEDVRoCxTFQDvyQEzdo:1733127486:pprRrNL2WP4yfVXsSNBSx6 |
+
+It's important to properly configure keys for lexicographical sorting to avoid
+incorrect ordering.
+
+A **prefix scan** through the preceding keys can be achieved using the prefix
+`feed:tQpnEDVRoCxTFQDvyQEzdo`. All matching keys are returned, sorted by
+`timestamp`. Sorting can be done in ascending or descending order based on
+`timestamp` or `reversed timestamp` as needed:
+
+```go
+reversedTimestamp := math.MaxInt64 - time.Now().Unix()
+```
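+
+Note that byte-wise ordering only matches numeric ordering when the timestamps
+are encoded at a fixed width. A quick sketch of the pitfall and the
+zero-padding fix (the key layout here is illustrative):
+
+```go
+package main
+
+import (
+  "fmt"
+  "sort"
+)
+
+func main() {
+  // Variable-width decimals break byte-wise ordering: "999" sorts after "1000".
+  fmt.Println("999" < "1000") // false
+
+  // Zero-padding to a fixed width restores numeric ordering.
+  keys := []string{
+    fmt.Sprintf("feed:user1:%010d", 1000),
+    fmt.Sprintf("feed:user1:%010d", 999),
+  }
+  sort.Strings(keys)
+  fmt.Println(keys[0]) // feed:user1:0000000999
+}
+```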
+
+This makes it possible to implement simple pagination by using a limit for the
+number of keys and a cursor (the last key from the previous iteration) to
+identify where to resume.
+
+```go
+// startCursor may look like 'feed:tQpnEDVRoCxTFQDvyQEzdo:1733127486'.
+// A prefix scan with this cursor will locate the specific key where
+// the previous iteration stopped.
+err = db.badger.View(func(txn *badger.Txn) error {
+  it := txn.NewIterator(opts)
+  defer it.Close()
+
+  // Prefix example: 'feed:tQpnEDVRoCxTFQDvyQEzdo'.
+  // If no cursor is provided, the prefix scan starts from the beginning.
+  p := prefix
+  if startCursor != nil {
+    p = startCursor
+  }
+  iterNum := 0 // Tracks the number of iterations to enforce the limit.
+  for it.Seek(p); it.ValidForPrefix(p); it.Next() {
+    // The method it.ValidForPrefix ensures that iteration continues
+    // as long as keys match the prefix.
+    // For example, if p = 'feed:tQpnEDVRoCxTFQDvyQEzdo:1733127486',
+    // it matches keys like
+    // 'feed:tQpnEDVRoCxTFQDvyQEzdo:1733127486:pprRrNL2WP4yfVXsSNBSx6'.
+
+    // Once the starting point for iteration is found, revert the prefix
+    // back to 'feed:tQpnEDVRoCxTFQDvyQEzdo' to continue iterating sequentially.
+    // Otherwise, iteration would stop after a single prefix-key match.
+    p = prefix
+
+    item := it.Item()
+    key := string(item.Key())
+
+    if iterNum > limit { // Limit reached.
+      nextCursor = key // Save the next cursor for future iterations.
+      return nil
+    }
+    iterNum++ // Increment iteration count.
+
+    err := item.Value(func(v []byte) error {
+      fmt.Printf("key=%s, value=%s\n", key, v)
+      return nil
+    })
+    if err != nil {
+      return err
+    }
+  }
+  // If the number of iterations is less than the limit,
+  // it means there are no more items for the prefix.
+  if iterNum < limit {
+    nextCursor = ""
+  }
+  return nil
+})
+return nextCursor, err
+```
+
+### Key-only iteration
+
+Badger supports a unique mode of iteration called _key-only_ iteration. It's
+several orders of magnitude faster than regular iteration, because it involves
+access to the LSM tree only, which is usually resident entirely in RAM. To
+enable key-only iteration, you need to set the `IteratorOptions.PrefetchValues`
+field to `false`. This can also be used to do sparse reads for selected keys
+during an iteration, by calling `item.Value()` only when required.
+
+```go
+err := db.View(func(txn *badger.Txn) error {
+  opts := badger.DefaultIteratorOptions
+  opts.PrefetchValues = false
+  it := txn.NewIterator(opts)
+  defer it.Close()
+  for it.Rewind(); it.Valid(); it.Next() {
+    item := it.Item()
+    k := item.Key()
+    fmt.Printf("key=%s\n", k)
+  }
+  return nil
+})
+```
+
+## Stream
+
+Badger provides a Stream framework, which concurrently iterates over all or a
+portion of the DB, converts data into custom key-values, and streams it out
+serially to be sent over the network, written to disk, or even written back to
+Badger. This is a much faster way to iterate over Badger than using a single
+Iterator. Stream supports Badger in both managed and normal mode.
+
+Stream uses the natural boundaries created by SSTables within the LSM tree, to
+quickly generate key ranges. Each goroutine then picks a range and runs an
+iterator to iterate over it. Each iterator iterates over all versions of values
+and is created from the same transaction, thus working over a snapshot of the
+DB. Every time a new key is encountered, it calls `ChooseKey(item)`, followed by
+`KeyToList(key, itr)`. This allows a user to select or reject that key, and if
+selected, convert the value versions into custom key-values. The goroutine
+batches up 4 MB worth of key-values, before sending it over to a channel.
+Another goroutine further batches up data from this channel using a _smart
+batching_ algorithm and calls `Send` serially.
+
+This framework is designed for high throughput key-value iteration, spreading
+the work of iteration across many goroutines. `DB.Backup` uses this framework to
+provide full and incremental backups quickly. Dgraph is a heavy user of this
+framework. In fact, this framework was developed and used within Dgraph, before
+getting ported over to Badger.
+
+```go
+stream := db.NewStream()
+// db.NewStreamAt(readTs) for managed mode.
+
+// -- Optional settings
+stream.NumGo = 16 // Set number of goroutines to use for iteration.
+stream.Prefix = []byte("some-prefix") // Leave nil for iteration over the whole DB.
+stream.LogPrefix = "Badger.Streaming" // For identifying stream logs. Outputs to Logger.
+
+// ChooseKey is called concurrently for every key. If left nil, assumes true by default.
+stream.ChooseKey = func(item *badger.Item) bool {
+  return bytes.HasSuffix(item.Key(), []byte("er"))
+}
+
+// KeyToList is called concurrently for chosen keys. This can be used to convert
+// Badger data into custom key-values. If nil, uses stream.ToList, a default
+// implementation, which picks all valid key-values.
+stream.KeyToList = nil
+
+// -- End of optional settings.
+
+// Send is called serially, while Stream.Orchestrate is running.
+stream.Send = func(list *pb.KVList) error {
+  return proto.MarshalText(w, list) // Write to w.
+}
+
+// Run the stream
+if err := stream.Orchestrate(context.Background()); err != nil {
+  return err
+}
+// Done.
+```
+
+## Garbage collection
+
+Badger values need to be garbage collected for two reasons:
+
+- Badger keeps values separately from the LSM tree. This means that the
+  compaction operations that clean up the LSM tree don't touch the values at
+  all. Values need to be cleaned up separately.
+
+- Concurrent read/write transactions could leave behind multiple values for a
+ single key, because they're stored with different versions. These could
+ accumulate, and take up unneeded space beyond the time these older versions
+ are needed.
+
+Badger relies on the client to perform garbage collection at a time of their
+choosing. It provides the following method, which can be invoked at an
+appropriate time:
+
+- `DB.RunValueLogGC()`: This method is designed to do garbage collection while
+  Badger is online. Along with randomly picking a file, it uses statistics
+  generated by the LSM-tree compactions to pick files that are likely to lead
+  to maximum space reclamation. Call it during periods of low activity in your
+  system, or periodically. One call only results in the removal of at most one
+  log file. As an optimization, you could also immediately re-run it whenever
+  it returns a nil error (indicating a successful value log GC), as shown
+  below.
+
+  ```go
+  ticker := time.NewTicker(5 * time.Minute)
+  defer ticker.Stop()
+  for range ticker.C {
+  again:
+    err := db.RunValueLogGC(0.7)
+    if err == nil {
+      goto again
+    }
+  }
+  ```
+
+- `DB.PurgeOlderVersions()`: This method is **DEPRECATED** since v1.5.0. Now,
+ Badger's LSM tree automatically discards older/invalid versions of keys.
+
+<Note>
+  The `RunValueLogGC` method doesn't garbage collect the latest value log.
+</Note>
+
+## Database backup
+
+There are two public API methods, `DB.Backup()` and `DB.Load()`, which can be
+used to do online backups and restores. Badger v0.9 provides a CLI tool
+`badger`, which can do offline backup/restore. Make sure you have `$GOPATH/bin`
+in your PATH to use this tool.
+
+The command below creates a version-agnostic backup of the database, to a file
+`badger.bak` in the current working directory:
+
+```sh
+badger backup --dir
+```
+
+To restore `badger.bak` in the current working directory to a new database:
+
+```sh
+badger restore --dir
+```
+
+See `badger --help` for more details.
+
+If you have a Badger database that was created using v0.8 (or below), you can
+use the `badger_backup` tool provided in v0.8.1, and then restore it using the
+preceding command to upgrade your database to work with the latest version.
+
+```sh
+badger_backup --dir --backup-file badger.bak
+```
+
+We recommend that all users use the `Backup` and `Restore` APIs and tools.
+However, Badger is also rsync-friendly, because all files are immutable,
+barring the latest value log, which is append-only. So, rsync can be used as a
+rudimentary way to perform a backup. In the following script, we repeat rsync
+to ensure that the LSM tree remains consistent with the MANIFEST file while
+doing a full backup.
+
+```sh
+#!/bin/bash
+set -o history
+set -o histexpand
+# Makes a complete copy of a Badger database directory.
+# Repeat rsync if the MANIFEST and SSTables are updated.
+rsync -avz --delete db/ dst
+while !! | grep -q "(MANIFEST\|\.sst)$"; do :; done
+```
+
+## Memory usage
+
+Badger's memory usage can be managed by tweaking several options available in
+the `Options` struct that's passed in when opening the database using `DB.Open`.
+
+- Number of memtables (`Options.NumMemtables`)
+ - If you modify `Options.NumMemtables`, also adjust
+ `Options.NumLevelZeroTables` and `Options.NumLevelZeroTablesStall`
+ accordingly.
+- Number of concurrent compactions (`Options.NumCompactors`)
+- Size of table (`Options.BaseTableSize`)
+- Size of value log file (`Options.ValueLogFileSize`)
+
+If you want to decrease the memory usage of a Badger instance, tweak these
+options (ideally one at a time) until you achieve the desired memory usage.
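+
+As a hedged sketch, these options map to `With*` setters chained off
+`DefaultOptions`; the values below are illustrative placeholders, not tuned
+recommendations:
+
+```go
+opts := badger.DefaultOptions("/tmp/badger").
+  WithNumMemtables(2).        // fewer memtables means less RAM
+  WithNumLevelZeroTables(2).  // keep in step with NumMemtables
+  WithNumLevelZeroTablesStall(4).
+  WithNumCompactors(2).       // fewer concurrent compactions
+  WithBaseTableSize(4 << 20). // 4 MB tables
+  WithValueLogFileSize(256 << 20) // 256 MB value log files
+
+db, err := badger.Open(opts)
+```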
diff --git a/badger/troubleshooting.mdx b/badger/troubleshooting.mdx
new file mode 100644
index 00000000..554e4ebb
--- /dev/null
+++ b/badger/troubleshooting.mdx
@@ -0,0 +1,167 @@
+---
+title: Troubleshooting
+description: "Common issues and solutions with Badger"
+"og:title": "Troubleshooting - Badger"
+---
+
+## Writes are getting stuck
+
+**Update: with the new `Value(func(v []byte))` API, this deadlock can no longer
+happen.**
+
+The following is true for users on Badger v1.x.
+
+This can happen if a long-running iteration is run with `Prefetch` set to
+false, but an `Item::Value` call is made internally in the loop. That causes
+Badger to acquire read locks over the value log files, to avoid value log GC
+removing the file from underneath. As a side effect, this also blocks a new
+value log GC file from being created when the value log file boundary is hit.
+
+Please see GitHub issues
+[#293](https://github.com/hypermodeinc/badger/issues/293) and
+[#315](https://github.com/hypermodeinc/badger/issues/315).
+
+There are multiple workarounds during iteration:
+
+1. Use `Item::ValueCopy` instead of `Item::Value` when retrieving a value.
+1. Set `Prefetch` to true. Badger then copies the value and releases the file
+   lock immediately.
+1. When `Prefetch` is false, don't call `Item::Value` and do a pure key-only
+ iteration. This might be useful if you just want to delete a lot of keys.
+1. Do the writes in a separate transaction after the reads.
+
+## Writes are really slow
+
+Are you creating a new transaction for every single key update, and waiting for
+it to `Commit` fully before creating a new one? This leads to very low
+throughput.
+
+We've created the `WriteBatch` API, which provides a way to batch up many
+updates into a single transaction and `Commit` that transaction using callbacks
+to avoid blocking. This amortizes the cost of a transaction really well, and
+provides the most efficient way to do bulk writes.
+
+```go
+wb := db.NewWriteBatch()
+defer wb.Cancel()
+
+for i := 0; i < N; i++ {
+  err := wb.Set(key(i), value(i), 0) // Will create txns as needed.
+  handle(err)
+}
+handle(wb.Flush()) // Wait for all txns to finish.
+```
+
+Note that the `WriteBatch` API doesn't allow any reads. For read-modify-write
+workloads, you should use the `Transaction` API.
+
+## I don't see any disk writes
+
+If you're using Badger with `SyncWrites=false`, then your writes might not be
+written to the value log and don't get synced to disk immediately. Writes to
+the LSM tree are done in memory first, before they get compacted to disk. The
+compaction only happens once `BaseTableSize` has been reached. So, if you're
+doing a few writes and then checking, you might not see anything on disk. Once
+you `Close` the database, you'll see these writes on disk.
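+
+If you want every write synced to disk before `Commit` returns, you can open
+the database with synchronous writes enabled. A minimal sketch, assuming the
+v2+ options API (`DefaultOptions`, `WithSyncWrites`):
+
+```go
+// SyncWrites trades write throughput for durability on every commit.
+opts := badger.DefaultOptions("/tmp/badger").WithSyncWrites(true)
+db, err := badger.Open(opts)
+if err != nil {
+  log.Fatal(err)
+}
+defer db.Close()
+```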
+
+## Reverse iteration doesn't produce the right results
+
+Just like forward iteration goes to the first key which is equal to or greater
+than the SEEK key, reverse iteration goes to the first key which is equal to or
+lesser than the SEEK key. Therefore, the SEEK key isn't part of the results.
+You can typically add a `0xff` byte as a suffix to the SEEK key to include it
+in the results. See the following issues:
+[#436](https://github.com/hypermodeinc/badger/issues/436) and
+[#347](https://github.com/hypermodeinc/badger/issues/347).
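+
+The `0xff` suffix trick can be sketched as follows, assuming the v2+ iterator
+API with `Reverse` enabled (`seekKey` is your own starting key):
+
+```go
+err := db.View(func(txn *badger.Txn) error {
+  opts := badger.DefaultIteratorOptions
+  opts.Reverse = true
+  it := txn.NewIterator(opts)
+  defer it.Close()
+  // Appending 0xff makes the seek position sort just after seekKey,
+  // so seekKey itself is included in the reverse results.
+  seek := append(append([]byte{}, seekKey...), 0xff)
+  for it.Seek(seek); it.Valid(); it.Next() {
+    // ... handle it.Item() ...
+  }
+  return nil
+})
+```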
+
+## Which instances should I use for Badger?
+
+We recommend using instances which provide local SSD storage, without any limit
+on the maximum IOPS. In AWS, these are storage optimized instances like i3. They
+provide local SSDs which clock 100K IOPS over 4KB blocks easily.
+
+## I'm getting a closed channel error
+
+```sh
+panic: close of closed channel
+panic: send on closed channel
+```
+
+If you're seeing panics like the above, it's because you're operating on a
+closed DB. This can happen if you call `Close()` before a write completes, or
+if you call it multiple times. Ensure that you only call `Close()` once, and
+that all your read/write operations finish before closing.
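+
+One defensive pattern, not part of Badger's API, is to wait for in-flight
+workers with a `sync.WaitGroup` and gate the close behind `sync.Once` so
+repeated or concurrent calls are harmless:
+
+```go
+package main
+
+import (
+  "fmt"
+  "sync"
+)
+
+// safeCloser makes a close function idempotent: concurrent or repeated
+// calls run the underlying close exactly once.
+type safeCloser struct {
+  once    sync.Once
+  closeFn func() error
+  err     error
+}
+
+func (s *safeCloser) Close() error {
+  s.once.Do(func() { s.err = s.closeFn() })
+  return s.err
+}
+
+// demo runs a few concurrent workers, waits for them, then calls Close
+// twice; it returns how many times the underlying close actually ran.
+func demo() int {
+  closes := 0
+  c := &safeCloser{closeFn: func() error { closes++; return nil }}
+
+  var wg sync.WaitGroup
+  for i := 0; i < 4; i++ {
+    wg.Add(1)
+    go func() {
+      defer wg.Done()
+      // ... reads/writes against the DB would go here ...
+    }()
+  }
+  wg.Wait() // all operations finished before closing
+  c.Close()
+  c.Close() // safe: the underlying close ran only once
+  return closes
+}
+
+func main() {
+  fmt.Println(demo()) // prints 1
+}
+```
+
+In a real program, `closeFn` would be your `*badger.DB`'s `Close` method.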
+
+## Are there any Go specific settings that I should use?
+
+We _highly_ recommend setting a high number for `GOMAXPROCS`, which allows Go to
+observe the full IOPS throughput provided by modern SSDs. In Dgraph, we have set
+it to 128. For more details,
+[see this thread](https://groups.google.com/d/topic/golang-nuts/jPb_h3TvlKE/discussion).
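+
+For example, early in your program's startup (128 mirrors the Dgraph setting
+mentioned above; tune it for your hardware):
+
+```go
+package main
+
+import (
+  "fmt"
+  "runtime"
+)
+
+func main() {
+  runtime.GOMAXPROCS(128) // allow up to 128 OS threads to run Go code
+  // Passing 0 queries the current setting without changing it.
+  fmt.Println(runtime.GOMAXPROCS(0)) // prints 128
+}
+```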
+
+## Are there any Linux specific settings that I should use?
+
+We recommend setting `max file descriptors` to a high number depending upon the
+expected size of your data. On Linux and Mac, you can check the file descriptor
+limit with `ulimit -n -H` for the hard limit and `ulimit -n -S` for the soft
+limit. A soft limit of `65535` is a good lower bound. You can adjust the limit
+as needed.
+
+## I see "manifest has unsupported version: X (we support Y)" error
+
+This error means you have a badger directory which was created by an older
+version of badger and you're trying to open it with a newer version of badger.
+The underlying data format can change across badger versions and users have to
+migrate their data directory. Badger data can be migrated from version X of
+badger to version Y of badger by following the steps listed below. Assume you
+were on badger v1.6.0 and wish to migrate to v2.0.0.
+
+1. Install badger version v1.6.0
+
+ - `cd $GOPATH/src/github.com/dgraph-io/badger`
+ - `git checkout v1.6.0`
+ - `cd badger && go install`
+
+ This should install the old badger binary in your `$GOBIN`.
+
+2. Create Backup
+ - `badger backup --dir path/to/badger/directory -f badger.backup`
+3. Install badger version v2.0.0
+
+ - `cd $GOPATH/src/github.com/dgraph-io/badger`
+ - `git checkout v2.0.0`
+ - `cd badger && go install`
+
+ This should install the new badger binary in your `$GOBIN`.
+
+4. Restore data from backup
+
+ - `badger restore --dir path/to/new/badger/directory -f badger.backup`
+
+   This creates a new directory at `path/to/new/badger/directory` and adds
+   badger data in the newer format to it.
+
+NOTE - The preceding steps shouldn't cause any data loss but please ensure the
+new data is valid before deleting the old badger directory.
+
+## Why do I need gcc to build badger? Does badger need Cgo?
+
+Badger doesn't directly use Cgo, but it relies on the
+[DataDog/zstd](https://github.com/DataDog/zstd) library for zstd compression,
+and that library requires [`gcc/cgo`](https://pkg.go.dev/cmd/cgo). You can
+build Badger without Cgo by running `CGO_ENABLED=0 go build`. This builds
+Badger without support for the ZSTD compression algorithm.
+
+As of Badger versions
+[v2.2007.4](https://github.com/hypermodeinc/badger/releases/tag/v2.2007.4) and
+[v3.2103.1](https://github.com/hypermodeinc/badger/releases/tag/v3.2103.1), the
+DataDog ZSTD library was replaced by a pure Go implementation and Cgo is no
+longer required. The new library is
+[backwards compatible in nearly all cases](https://discuss.dgraph.io/t/use-pure-go-zstd-implementation/8670/10):
+
+
+> Yes they're compatible both ways. The only exception is 0 bytes of input
+> which gives 0 bytes output with the Go zstd. But you already have the
+> zstd.WithZeroFrames(true) which will wrap 0 bytes in a header so it can be
+> fed to DD zstd. This is only relevant when downgrading.
+
diff --git a/community-and-support.mdx b/community-and-support.mdx
index 32ded252..59d02231 100644
--- a/community-and-support.mdx
+++ b/community-and-support.mdx
@@ -19,6 +19,7 @@ relevant GitHub repository:
- [Modus](https://github.com/hypermodeinc/modus/issues)
- [Hyp CLI](https://github.com/hypermodeinc/hyp-cli/issues)
+- [Badger](https://github.com/hypermodeinc/badger/issues)
All paid Hypermode packages include commercial support. Customers can reach out
via the Hypermode Console or through email at
diff --git a/mint.json b/mint.json
index a2af2870..febe6714 100644
--- a/mint.json
+++ b/mint.json
@@ -37,6 +37,10 @@
{
"name": "Modus",
"url": "modus"
+ },
+ {
+ "name": "Badger",
+ "url": "badger"
}
],
"redirects": [
@@ -177,6 +181,14 @@
{
"group": "Resources",
"pages": ["modus/changelog"]
+ },
+ {
+ "group": "Getting Started",
+ "pages": ["badger/overview", "badger/quickstart"]
+ },
+ {
+ "group": "Resources",
+ "pages": ["badger/troubleshooting", "badger/design"]
}
],
"feedback": {
diff --git a/styles/config/vocabularies/general/accept.txt b/styles/config/vocabularies/general/accept.txt
index fe45eb26..872936f5 100644
--- a/styles/config/vocabularies/general/accept.txt
+++ b/styles/config/vocabularies/general/accept.txt
@@ -49,6 +49,7 @@ CRUD
Debugf
Dgraph|dgraph
DQL
+embeddable
Errorf
GitHub|github
GraphiQL