feat(storage): add block level timestamp to OLAP storage #136
base: master
| } | ||
| } | ||
| // Push DelEdge with the previous edge id (old_commit_ts passed as timestamp) | ||
| txn.push_undo(crate::common::DeltaOp::DelEdge(eid), old_commit_ts); |
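`delete_edge_in_txn` only updates the edge's `commit_ts` to the current transaction ID and writes an undo log entry; it never sets a tombstone. `get_edge_at_ts` returns the edge's data as soon as `commit_ts` satisfies the visibility condition, so the edge becomes visible to the very transaction that deleted it.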
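One way to express the missing tombstone is to keep the delete timestamp separate from the create timestamp. The following is a minimal sketch, not the PR's code: `EdgeRecord`, `TXN_ID_BIT`, `ts_visible`, and `edge_visible` are hypothetical names, plain `u64` values stand in for `Timestamp`, and it assumes transaction IDs are drawn from a range above every commit timestamp.

```rust
/// Hypothetical, simplified edge record (not the PR's `OlapStorageEdge`).
#[derive(Debug, Clone, PartialEq)]
struct EdgeRecord {
    /// Commit timestamp of the insert, or the writer's txn id while uncommitted.
    create_ts: u64,
    /// Tombstone: commit timestamp (or txn id) of the delete, if any.
    delete_ts: Option<u64>,
}

/// Assumption: txn ids carry this high bit, so they never satisfy `ts <= snapshot_ts`.
const TXN_ID_BIT: u64 = 1 << 63;

/// A timestamp is visible if the reader wrote it or it committed at or before the snapshot.
fn ts_visible(ts: u64, snapshot_ts: u64, own_txn_id: u64) -> bool {
    ts == own_txn_id || ts <= snapshot_ts
}

/// An edge is visible when its creation is visible and its deletion is not.
fn edge_visible(edge: &EdgeRecord, snapshot_ts: u64, own_txn_id: u64) -> bool {
    let created = ts_visible(edge.create_ts, snapshot_ts, own_txn_id);
    let deleted = edge
        .delete_ts
        .map_or(false, |ts| ts_visible(ts, snapshot_ts, own_txn_id));
    created && !deleted
}

/// Marks the edge as deleted by `txn_id` and returns the previous tombstone for the undo log.
fn delete_edge_in_txn(edge: &mut EdgeRecord, txn_id: u64) -> Option<u64> {
    edge.delete_ts.replace(txn_id)
}
```

With this layout the deleting transaction stops seeing the edge immediately, while concurrent readers keep seeing it until the delete commits at a timestamp within their snapshot.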
| ); | ||
| } | ||
| let property_block = &mut column.blocks[block_idx]; | ||
| property_block.values[offset] = Some(prop); |
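The old property value is overwritten in place here. `push_undo` is called, but the `UndoBuffer` is private to the transaction and is only consulted in `abort()`. Because the `UndoBuffer` is cleared once the transaction commits, the old property value is lost permanently.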
The original code here modified attributes directly, with no transactional handling, and the existing rollback logic was written against that behavior. You are welcome to elaborate on how you would change it.
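One possible direction, sketched under assumptions rather than taken from the PR: keep a short version list per property slot in shared storage, so a committed write appends a version instead of destroying the previous one. `VersionedSlot` and its methods are hypothetical, `i64` stands in for the real property value type, and `watermark` is assumed to be the oldest snapshot timestamp still in use.

```rust
/// Hypothetical versioned property slot; `i64` stands in for the PR's property type.
#[derive(Debug, Default)]
struct VersionedSlot {
    /// Versions in ascending commit order: (commit_ts, value).
    versions: Vec<(u64, Option<i64>)>,
}

impl VersionedSlot {
    /// Appends a new version instead of overwriting the old value in place.
    fn write(&mut self, commit_ts: u64, value: Option<i64>) {
        self.versions.push((commit_ts, value));
    }

    /// Value of the newest version, regardless of snapshot.
    fn latest(&self) -> Option<i64> {
        self.versions.last().and_then(|(_, value)| *value)
    }

    /// Drops versions that no reader at or after `watermark` can still need:
    /// the newest version at or below the watermark is kept, older ones go.
    fn gc(&mut self, watermark: u64) {
        let keep_from = self
            .versions
            .iter()
            .rposition(|(ts, _)| *ts <= watermark)
            .unwrap_or(0);
        self.versions.drain(..keep_from);
    }
}
```

The cost is extra memory per updated slot until garbage collection runs, which may or may not fit the block layout this PR is aiming for.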
| for (col_idx, column) in self.property_columns.read().unwrap().iter().enumerate() { | ||
| if let Some(val) = column | ||
| .blocks | ||
| .get(block_idx) | ||
| .and_then(|blk| blk.values.get(offset)) | ||
| .cloned() | ||
| { | ||
| props.set_prop(col_idx, val); | ||
| } | ||
| } |
Property reads apply no timestamp filtering or look-back at all; they simply return whatever value currently sits in the array.
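For illustration, a timestamp-aware read could look like the sketch below. It assumes each slot exposes its versions as a slice sorted by ascending commit timestamp; `read_prop_at_ts` is a hypothetical helper, and `i64` again stands in for the property value type.

```rust
/// Returns the value of the newest version committed at or before `snapshot_ts`.
/// Assumes `versions` is sorted by ascending commit timestamp.
fn read_prop_at_ts(versions: &[(u64, Option<i64>)], snapshot_ts: u64) -> Option<i64> {
    // Number of versions with commit_ts <= snapshot_ts (binary search on sorted input).
    let visible = versions.partition_point(|(ts, _)| *ts <= snapshot_ts);
    // No visible version means the property did not exist yet for this snapshot.
    let (_, value) = versions.get(visible.checked_sub(1)?)?;
    *value
}
```

A `None` stored as the value models a property that was cleared, so a reader whose snapshot includes that version gets `None` back.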
| // remove it | ||
| for j in offset..block.edge_counter - 1 { | ||
| block.edges[j] = block.edges[j + 1]; | ||
| } | ||
| block.edge_counter -= 1; | ||
| block.edges[block.edge_counter] = OlapStorageEdge { | ||
| eid: 0, | ||
| label_id: NonZeroU32::new(1), | ||
| dst_id: 1, | ||
| commit_ts: Timestamp::with_ts(0), | ||
| }; |
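- This only removes the entry from `block.edges` by shifting the later entries down; the property values in the `PropertyBlock` that corresponds to `block` stay at their original positions.
- Creating this edge may have updated `block.min_ts`/`max_ts`, so `abort()` needs to restore them.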
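Both points can be handled in one removal routine. This is a rough sketch with hypothetical `Edge` and `Block` types: it uses `Vec` instead of the fixed-size array plus `edge_counter` in the PR, recomputes the bounds by scanning instead of restoring saved values, and picks `u64::MAX`/`0` as arbitrary sentinels for an empty block.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Edge {
    eid: u64,
    commit_ts: u64,
}

/// Hypothetical block: every property column is index-aligned with `edges`.
#[derive(Debug)]
struct Block {
    edges: Vec<Edge>,
    columns: Vec<Vec<Option<i64>>>,
    min_ts: u64,
    max_ts: u64,
}

impl Block {
    /// Removes the edge at `offset`, shifts every property column in lockstep,
    /// and recomputes the block-level timestamp bounds from the remaining edges.
    fn remove_at(&mut self, offset: usize) -> Option<Edge> {
        if offset >= self.edges.len() {
            return None;
        }
        let removed = self.edges.remove(offset);
        for column in &mut self.columns {
            if offset < column.len() {
                column.remove(offset);
            }
        }
        // Assumed sentinels for an empty block: min = u64::MAX, max = 0.
        self.min_ts = self.edges.iter().map(|e| e.commit_ts).min().unwrap_or(u64::MAX);
        self.max_ts = self.edges.iter().map(|e| e.commit_ts).max().unwrap_or(0);
        Some(removed)
    }
}
```

Rescanning the block is O(n) per abort; storing the previous `min_ts`/`max_ts` in the undo entry would avoid the scan.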
| }, | ||
| ) | ||
| } |
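From an AI review, for reference: the line `let _ = std::mem::take(&mut *edges_borrow);` empties `edges` (the uncompressed, active data) outright. The intent was probably "the data now lives in `compressed_edges`, so it is no longer needed here". The consequence is that all transaction logic (transaction.rs) only knows about `storage.edges`. Once compaction runs, `storage.edges` is empty, and an in-flight transaction that calls `commit()` or `abort()` can no longer find its edges. The transaction either fails because `offset >= block.edge_counter` or operates on the wrong location.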
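One conceivable mitigation is to let compaction skip blocks that in-flight transactions still reference. The sketch below is an assumption-laden illustration: `ActiveBlock`, `Storage`, `pin`, `unpin`, and `compact` are hypothetical, edges are reduced to bare `u64` ids, and the "compressed" side is just a map holding the moved data.

```rust
use std::collections::BTreeMap;

/// Hypothetical active block with a count of in-flight transactions that reference it.
struct ActiveBlock {
    edges: Vec<u64>,
    pins: usize,
}

struct Storage {
    active: Vec<ActiveBlock>,
    /// Stand-in for `compressed_edges`; real compression is not modeled.
    compressed: BTreeMap<usize, Vec<u64>>,
}

impl Storage {
    /// Called when a transaction first touches a block.
    fn pin(&mut self, idx: usize) -> bool {
        match self.active.get_mut(idx) {
            Some(block) => {
                block.pins += 1;
                true
            }
            None => false,
        }
    }

    /// Called when the transaction commits or aborts.
    fn unpin(&mut self, idx: usize) {
        if let Some(block) = self.active.get_mut(idx) {
            block.pins = block.pins.saturating_sub(1);
        }
    }

    /// Moves only unpinned, non-empty blocks; returns how many were moved.
    fn compact(&mut self) -> usize {
        let mut moved = 0;
        for (idx, block) in self.active.iter_mut().enumerate() {
            if block.pins > 0 || block.edges.is_empty() {
                continue;
            }
            let edges = std::mem::take(&mut block.edges);
            self.compressed.insert(idx, edges);
            moved += 1;
        }
        moved
    }
}
```

A pinned block stays in the active area until its last transaction finishes, so the `(block_idx, offset)` locations recorded by that transaction remain valid.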
| let (block_idx, offset) = *loc.value(); | ||
| let mut edges = self.storage.edges.write().unwrap(); | ||
| if let Some(block) = edges.get_mut(block_idx) | ||
| && offset < block.edge_counter |
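From an AI review, for reference: transaction commit only looks in the "old location". The commit logic is entirely unaware that the data may have been moved into the compressed area; it rigidly looks in `self.storage.edges`. If compaction has already happened by then, the commit fails silently (the `if` condition is not met). Worse, the array index is still the original `offset`, so if new data has since been written there, the commit modifies the state of that unrelated new data.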
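A cheap guard against both failure modes is to verify the edge id at the recorded location and return an error instead of skipping silently. A minimal sketch with hypothetical names (`CommitError`, `commit_edge`) and simplified `Edge`/`Block` types; the PR's locking and `Timestamp` type are left out.

```rust
#[derive(Debug, PartialEq)]
enum CommitError {
    BlockMissing,
    OffsetOutOfRange,
    /// The slot holds a different edge than the one the transaction wrote.
    EdgeMoved { expected: u64, found: u64 },
}

struct Edge {
    eid: u64,
    commit_ts: u64,
}

struct Block {
    edges: Vec<Edge>,
}

/// Stamps `commit_ts` on the edge at `loc`, but only if that slot still holds `expected_eid`.
fn commit_edge(
    blocks: &mut [Block],
    loc: (usize, usize),
    expected_eid: u64,
    commit_ts: u64,
) -> Result<(), CommitError> {
    let (block_idx, offset) = loc;
    let block = blocks.get_mut(block_idx).ok_or(CommitError::BlockMissing)?;
    let edge = block
        .edges
        .get_mut(offset)
        .ok_or(CommitError::OffsetOutOfRange)?;
    if edge.eid != expected_eid {
        return Err(CommitError::EdgeMoved {
            expected: expected_eid,
            found: edge.eid,
        });
    }
    edge.commit_ts = commit_ts;
    Ok(())
}
```

This only detects the problem; the caller still has to decide whether to retry against the compressed area or abort the transaction.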
| self.compressed_edges.write().unwrap().insert( | ||
| index, |
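From an AI review, for reference: the `CompressedEdgeBlock` design in the compressed area appears to be fully static (bitmaps, compressed integers). Even if a transaction wanted to update it, changing `commit_ts` without decompressing would be hard, since compressed data generally cannot be modified in place. Once data is compressed it is effectively a read-only archive, yet the system does nothing to prevent transactional operations on compressed data, which leaves a serious logic hole.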
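If compressed blocks are meant to be read-only, that could be made explicit in the type so a write is rejected rather than misapplied. The following is only a sketch: `EdgeBlock`, `WriteError`, and `set_commit_ts` are hypothetical, the active variant keeps just a timestamp column, and the compressed payload is treated as opaque bytes instead of the PR's `CompressedEdgeBlock`.

```rust
#[derive(Debug, PartialEq)]
enum WriteError {
    /// The block has been compacted and no longer accepts in-place updates.
    ReadOnlyCompressed,
    OffsetOutOfRange,
}

enum EdgeBlock {
    /// Mutable block; only the per-edge commit timestamps are modeled here.
    Active { commit_ts: Vec<u64> },
    /// Stand-in for the PR's `CompressedEdgeBlock`; the payload is opaque.
    Compressed { payload: Vec<u8> },
}

impl EdgeBlock {
    /// Updates a commit timestamp, refusing to touch compressed blocks.
    fn set_commit_ts(&mut self, offset: usize, ts: u64) -> Result<(), WriteError> {
        match self {
            EdgeBlock::Active { commit_ts } => {
                let slot = commit_ts
                    .get_mut(offset)
                    .ok_or(WriteError::OffsetOutOfRange)?;
                *slot = ts;
                Ok(())
            }
            EdgeBlock::Compressed { .. } => Err(WriteError::ReadOnlyCompressed),
        }
    }
}
```

Callers that receive `ReadOnlyCompressed` could decompress the block back into the active area or abort; which is appropriate depends on how compaction is scheduled.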
feat(storage): add block level timestamp to OLAP storage
Type
- feat: (new feature)
- fix: (bug fix)
- docs: (doc update)
- refactor: (refactor code)
- test: (test code)
- chore: (other updates)
Scope
- query: (query engine)
- parser: (frontend parser)
- planner: (frontend planner)
- optimizer: (query optimizer)
- executor: (execution engine)
- op: (operators)
- storage: (storage engine)
- mvcc: (multi version concurrency control)
- schema: (graph model and topology)
- tool: (tools)
- cli: (cli)
- sdk: (sdk)
- none: (N/A)
Description
Issue: #111
Checklist
masterbranch.