Put ====== The key/value put operation flow: :: -> WALOnly DB::Put -> WriteBatch::Put -> DB::Write -> -> unorderedWrite -> pipelineWrite DBImpl::WriteWALOnly -> DBImpl::ConcurrentWriteToWAL -> DBImpl::WriteToWAL -> Writer::AddRecord DBImpl::UnorderedWrite -> WriteBatchInternal::InsertInto -> WriteBatch::Iterate -> MemTableInserter::PutCF -> MemTable::Add/Update DBImpl::PipelineWrite -> WriteThread::JoinBatchGroup -> WriteThread::EnterAsBatchGroupLeader -> DBImpl::WriteToWAL -> WriteThread::ExitAsBatchGroupLeader -> WriteThread::EnterAsMemTableWriter -> WriteThread::LaunchParallelMemTableWriters/WriteBatchInternal::InsertInto -> WriteThread::ExitAsMemTableWriter SimpleWrite WriteThread::JoinBatchGroup->WriteThread::EnterAsBatchGroupLeader -> DBImpl::WriteToWAL/DBImpl::ConcurrentWriteToWAL -> WriteBatchInternal::InsertInto/WriteThread::LaunchParallelMemTableWriters -> WriteThread::CompleteParallelMemTableWriter -> WriteThread::ExitAsBatchGroupLeader Multiple Threads Write: :: Group Thread1 Thread2 Thread3 | | | V V V Join BatchGroup | | | V V V Enter As BatchGroupLeader --> AwaitState | | V |------------ Batch Commit WAL | | | | |transform to follower V | | Launch Parallel Followers-- | | | V V Insert MemTable | V Complete Parallel Writers | V Exit As BatchGroupLeader ------> Notify Successor Leader | v Write End and Return Multiple Threads Pipeline Write: :: Group Thread1 Thread2 Thread3 | | | V V V Join BatchGroup | | | V V V Enter As BatchGroupLeader --> AwaitState | | | V |--------- | Batch Commit WAL | | | | |transform to follower V | | Exit As BatchGroupLeader --> Notify Successor | | | V V EnterAsMemTableWriter | V Insert MemTable | V ExitAsMemTableWriter | v Write End and Return The State of Writer: +----------------------------------+------------------------------------------+ | state name | note | +----------------------------------+------------------------------------------+ | STATE_INIT | Created | +----------------------------------+------------------------------------------+ | STATE_GROUP_LEADER | JoinBatchGroup, as Leader if first Join | +----------------------------------+------------------------------------------+ | STATE_MEMTABLE_WRITER_LEADER | Serial Write Leader | +----------------------------------+------------------------------------------+ | STATE_PARALLEL_MEMTABLE_WRITER | Parallel Write Leader | +----------------------------------+------------------------------------------+ | STATE_COMPLETED | Complete Write | +----------------------------------+------------------------------------------+ | STATE_LOCKED_WAITING | Waiting | +----------------------------------+------------------------------------------+ The State Transform: :: JoinBatchGroup ConcurrentWriteToWAL STATE_INIT ---------------> STATE_GROUP_LEADER ----------------------> STATE_COMPLETED | ^ | ^ | Waiting Previous |-----| | | V Writing | V | STATE_LOCKED_WAITING ----- STATE_LOCKED_WAITING STATE_LOCKED_WAITING | ^ |------> STATE_MEMTABLE_WRITER_LEADER --| |------> STATE_PARALLEL_MEMTABLE_WRITER-| The data structure transform: | 1. The (K:string, V:string) tuple user given. | 2. The WriteBatch rep as below: +--------+------------+---------------------------------------------+ | Label | Type | note | +--------+------------+---------------------------------------------+ | Seq | Byte[8] | Sequence Number | +--------+------------+---------------------------------------------+ | count | Fixed32 | KV count | +--------+------------+---------------------------------------------+ | kType | Byte | Key Type | +--------+------------+---------------------------------------------+ | kLen | Var32 | Key Length(Optional timestamp suffix Length)| +--------+------------+---------------------------------------------+ | Key | Byte[kLen] | Key(Optional timestamp suffix) | +--------+------------+---------------------------------------------+ | vLen | Var32 | Value Length | +--------+------------+---------------------------------------------+ | Value | Bytes[vLen]| Value | +--------+------------+---------------------------------------------+ | | KV... | More KV | +--------+------------+---------------------------------------------+ | 3. Merge multiple WriteBatch(Current and others in WriteThread) to one WriteBatch; | 4. Write to WAL record; | 5. Fetch Key/Value(Slice, Slice),kType from Current WriteBatch | 6. Write Current Key/Value/kType to WriteBatch for transaction(Optional); | 7. Write Current Key/Value/kType(Combine to internal Key) to MemTable The internal key in MemTable: +--------+----------------+---------------------------------------------+ | Label | Type | note | +--------+----------------+---------------------------------------------+ | kLen | Var32 | internal key Length | +--------+----------------+---------------------------------------------+ | Key | Byte[kLen-8] | The key from last flow | +--------+----------------+---------------------------------------------+ | Packed| Fixed64 | [SeqNumber[56],kType[8]] | +--------+----------------+---------------------------------------------+ | vLen | Var32 | The value Length | +--------+----------------+---------------------------------------------+ | Value | Byte[vLen] | The value | +--------+----------------+---------------------------------------------+