This pattern is part of Patterns of Distributed Systems
Segmented Log
Split the log into multiple smaller files, instead of a single large file, for easier operations.
13 August 2020
Problem
A single log file can grow large and become a performance bottleneck when it is read at startup. Older log entries are cleaned up periodically, and cleanup operations on a single huge file are difficult to implement.
Solution
A single log is split into multiple segments. The log file is rolled over once it reaches a specified size limit.

public synchronized Long writeEntry(WALEntry entry) {
    maybeRoll();
    return openSegment.writeEntry(entry);
}

private void maybeRoll() {
    if (openSegment.size() >= config.getMaxLogSize()) {
        openSegment.flush();
        sortedSavedSegments.add(openSegment);
        long lastId = openSegment.getLastLogEntryIndex();
        openSegment = WALSegment.open(lastId, config.getWalDir());
    }
}
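
The snippet above relies on `WALSegment` and `config`, which are not shown here. As a minimal, self-contained sketch of the same roll-over rule, the following keeps segments in memory and measures size as the number of payload bytes. The class `RollingLogSketch`, its fields, and the convention that a segment's base offset is the index of its first entry are all assumptions made for illustration, not the article's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for a segmented log (illustration only). */
final class RollingLogSketch {
    /** One segment: the index of its first entry plus the entries it holds. */
    static final class Segment {
        final long baseOffset;
        final List<byte[]> entries = new ArrayList<>();
        long sizeInBytes = 0;

        Segment(long baseOffset) {
            this.baseOffset = baseOffset;
        }
    }

    private final long maxSegmentBytes;
    private final List<Segment> savedSegments = new ArrayList<>();
    private Segment openSegment = new Segment(1); // assumption: indexes start at 1
    private long lastIndex = 0;

    RollingLogSketch(long maxSegmentBytes) {
        this.maxSegmentBytes = maxSegmentBytes;
    }

    /** Appends an entry and returns its index, rolling to a new segment first if needed. */
    synchronized long append(byte[] data) {
        maybeRoll();
        openSegment.entries.add(data);
        openSegment.sizeInBytes += data.length;
        return ++lastIndex;
    }

    /** Closes the open segment once it has reached the size limit and opens a fresh one. */
    private void maybeRoll() {
        if (openSegment.sizeInBytes >= maxSegmentBytes) {
            savedSegments.add(openSegment);
            openSegment = new Segment(lastIndex + 1);
        }
    }

    /** Base offsets of all segments, oldest first, ending with the open segment. */
    synchronized List<Long> baseOffsets() {
        List<Long> result = new ArrayList<>();
        for (Segment segment : savedSegments) {
            result.add(segment.baseOffset);
        }
        result.add(openSegment.baseOffset);
        return result;
    }
}
```

As in the snippet above, the size check runs before the write, so a segment can exceed the limit by up to one entry before it is rolled.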
With log segmentation, there needs to be an easy way to map logical log offsets (or log sequence numbers) to the log segment files. This can be done in two ways:
- Each log segment name is generated from a well-known prefix and the base offset (or log sequence number).
- Each log sequence number is divided into two parts, the name of the file and the transaction offset.
public static String createFileName(Long startIndex) {
    return logPrefix + "_" + startIndex + logSuffix;
}

public static Long getBaseOffsetFromFileName(String fileName) {
    String[] nameAndSuffix = fileName.split(logSuffix);
    String[] prefixAndOffset = nameAndSuffix[0].split("_");
    if (prefixAndOffset[0].equals(logPrefix))
        return Long.parseLong(prefixAndOffset[1]);

    return -1L;
}
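
The code above illustrates the first option, where the base offset is part of the file name. The second option in the list, splitting the log sequence number into a file part and an offset part, could be sketched as below. The 32/32 bit split, the class name `SegmentedLsn`, and the `wal_%08d.log` naming are assumptions chosen for illustration; a real system may divide the bits and name its files differently.

```java
import java.util.Locale;

/** Hypothetical helper: packs a segment file id and an in-file offset into one long. */
final class SegmentedLsn {
    private static final int OFFSET_BITS = 32; // assumption: low 32 bits hold the offset
    private static final long OFFSET_MASK = (1L << OFFSET_BITS) - 1;

    private SegmentedLsn() {
    }

    /** Combines the file id (high bits) and the offset within that file (low bits). */
    static long encode(int fileId, long offsetInFile) {
        if (fileId < 0) {
            throw new IllegalArgumentException("fileId must be non-negative");
        }
        if (offsetInFile < 0 || offsetInFile > OFFSET_MASK) {
            throw new IllegalArgumentException("offset does not fit in " + OFFSET_BITS + " bits");
        }
        return ((long) fileId << OFFSET_BITS) | offsetInFile;
    }

    /** Extracts the file id from the high bits. */
    static int fileId(long lsn) {
        return (int) (lsn >>> OFFSET_BITS);
    }

    /** Extracts the offset within the file from the low bits. */
    static long offsetInFile(long lsn) {
        return lsn & OFFSET_MASK;
    }

    /** Derives the segment file name from the file part alone (naming is an assumption). */
    static String fileName(long lsn) {
        return String.format(Locale.ROOT, "wal_%08d.log", fileId(lsn));
    }
}
```

Because the file id occupies the high bits, sequence numbers from a later file always compare greater than those from an earlier one, so the combined value still orders the log.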
With this information, the read operation takes two steps. For a given offset (or transaction id), the log segment containing it is identified, and then the log records are read from that segment and all subsequent segments.
public synchronized List<WALEntry> readFrom(Long startIndex) {
    List<WALSegment> segments = getAllSegmentsContainingLogGreaterThan(startIndex);
    return readWalEntriesFrom(startIndex, segments);
}

private List<WALSegment> getAllSegmentsContainingLogGreaterThan(Long startIndex) {
    List<WALSegment> segments = new ArrayList<>();
    //Start from the last segment to the first segment with starting offset less than startIndex
    //This will get all the segments which have log entries more than the startIndex
    for (int i = sortedSavedSegments.size() - 1; i >= 0; i--) {
        WALSegment walSegment = sortedSavedSegments.get(i);
        segments.add(walSegment);

        if (walSegment.getBaseOffset() <= startIndex) {
            break; // break for the first segment with baseoffset less than startIndex
        }
    }

    if (openSegment.getBaseOffset() <= startIndex) {
        segments.add(openSegment);
    }

    return segments;
}
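
As an alternative sketch of the segment lookup, a sorted map keyed by base offset can find the starting segment directly and return the segments oldest first. The same index also shows why segmentation helps with the cleanup mentioned in the Problem section: whole segments below a given index can be dropped as a unit. The class `SegmentIndex`, its method names, and the convention that a base offset is the index of the segment's first entry are assumptions for illustration, and the index tracks file names only rather than the `WALSegment` objects used above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical index from segment base offset to segment file name. */
final class SegmentIndex {
    private final NavigableMap<Long, String> segmentsByBaseOffset = new TreeMap<>();

    void addSegment(long baseOffset, String fileName) {
        segmentsByBaseOffset.put(baseOffset, fileName);
    }

    /** Segment files, oldest first, that can hold entries at or after startIndex. */
    List<String> segmentsFrom(long startIndex) {
        // Greatest base offset <= startIndex, i.e. the segment that holds startIndex.
        Long first = segmentsByBaseOffset.floorKey(startIndex);
        if (first == null) {
            // startIndex precedes every segment, so all of them are needed.
            return new ArrayList<>(segmentsByBaseOffset.values());
        }
        return new ArrayList<>(segmentsByBaseOffset.tailMap(first, true).values());
    }

    /** Segment files whose entries all lie below lowWaterMark and can be deleted whole. */
    List<String> segmentsBelow(long lowWaterMark) {
        // The segment holding lowWaterMark must be kept; everything before it can go.
        Long keep = segmentsByBaseOffset.floorKey(lowWaterMark);
        if (keep == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(segmentsByBaseOffset.headMap(keep, false).values());
    }
}
```

With segments starting at 0, 100 and 200, a read from index 150 would touch only the last two files, and a cleanup below index 150 would remove only the first.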
Examples
Significant Revisions
13 August 2020: Published