This pattern is part of Patterns of Distributed Systems

Segmented Log

Split log into multiple smaller files instead of a single large file for easier operations.

13 August 2020

Problem

A single log file can grow and become a performance bottleneck while it is read at startup. Older logs are cleaned up periodically, but doing cleanup operations on a single huge file is difficult to implement.

Solution

The single log is split into multiple segments. Log files are rolled over after they reach a specified size limit.

public synchronized Long writeEntry(WALEntry entry) {
    maybeRoll();
    return openSegment.writeEntry(entry);
}

private void maybeRoll() {
    if (openSegment.size() >= config.getMaxLogSize()) {
        // The open segment has reached the size limit: flush it, move it
        // to the list of saved segments, and open a new segment starting
        // after the last entry's index.
        openSegment.flush();
        sortedSavedSegments.add(openSegment);
        long lastId = openSegment.getLastLogEntryIndex();
        openSegment = WALSegment.open(lastId, config.getWalDir());
    }
}
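
Segmentation also simplifies the cleanup described in the problem statement: instead of truncating within one huge file, whole segment files whose entries all fall below the cleanup point can simply be deleted. A minimal sketch, assuming a hypothetical truncateBefore method and a delete() on WALSegment that removes the backing file:

public synchronized void truncateBefore(long cleanupIndex) {
    Iterator<WALSegment> iterator = sortedSavedSegments.iterator();
    while (iterator.hasNext()) {
        WALSegment segment = iterator.next();
        // Any saved segment whose last entry index is below the cleanup
        // point can be removed as a whole file.
        if (segment.getLastLogEntryIndex() < cleanupIndex) {
            segment.delete(); // delete() is assumed to remove the segment file
            iterator.remove();
        }
    }
}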

With log segmentation, there needs to be an easy way to map logical log offsets (or log sequence numbers) to the log segment files. This can be done in two ways:

  • Each log segment name is generated from some well-known prefix and the base offset (or log sequence number).
  • Each log sequence number is divided into two parts: the name of the file and the transaction offset (see the sketch after the code below).

public static String createFileName(Long startIndex) {
    return logPrefix + "_" + startIndex + logSuffix;
}

public static Long getBaseOffsetFromFileName(String fileName) {
    String[] nameAndSuffix = fileName.split(logSuffix);
    String[] prefixAndOffset = nameAndSuffix[0].split("_");
    if (prefixAndOffset[0].equals(logPrefix))
        return Long.parseLong(prefixAndOffset[1]);

    return -1L;
}
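
The second approach can be sketched as follows; this variant is not shown in the original code, and it assumes a fixed number of entries per segment (ENTRIES_PER_SEGMENT is an assumed configuration value), so a log sequence number splits cleanly into a segment id used in the file name and an offset within that segment:

private static final long ENTRIES_PER_SEGMENT = 100_000;

public static String fileNameFor(long sequenceNumber) {
    // The segment id becomes part of the file name.
    long segmentId = sequenceNumber / ENTRIES_PER_SEGMENT;
    return logPrefix + "_" + segmentId + logSuffix;
}

public static long offsetWithinSegment(long sequenceNumber) {
    // The remainder locates the entry within the segment.
    return sequenceNumber % ENTRIES_PER_SEGMENT;
}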

With this information, the read operation takes two steps. For a given offset (or transaction id), the log segment containing it is identified, and all the log records are read from that segment and the subsequent ones.

public synchronized List<WALEntry> readFrom(Long startIndex) {
    List<WALSegment> segments = getAllSegmentsContainingLogGreaterThan(startIndex);
    return readWalEntriesFrom(startIndex, segments);
}

private List<WALSegment> getAllSegmentsContainingLogGreaterThan(Long startIndex) {
    List<WALSegment> segments = new ArrayList<>();
    // Start from the last segment and work back to the first segment whose
    // base offset is less than or equal to startIndex. This collects all
    // segments containing log entries at or after startIndex.
    for (int i = sortedSavedSegments.size() - 1; i >= 0; i--) {
        WALSegment walSegment = sortedSavedSegments.get(i);
        segments.add(walSegment);

        if (walSegment.getBaseOffset() <= startIndex) {
            break; // stop at the first segment whose base offset is <= startIndex
        }
    }

    if (openSegment.getBaseOffset() <= startIndex) {
        segments.add(openSegment);
    }

    return segments;
}
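
readWalEntriesFrom is not shown above. One possible sketch, assuming WALSegment exposes a readAll() method and WALEntry exposes getEntryIndex() (both are assumptions here), reads the collected segments in base-offset order and skips entries before startIndex:

private List<WALEntry> readWalEntriesFrom(Long startIndex, List<WALSegment> segments) {
    // The segments were collected from the newest saved segment backwards,
    // with the open segment appended last, so sort them by base offset to
    // read entries back in log order.
    segments.sort(Comparator.comparing(WALSegment::getBaseOffset));
    List<WALEntry> entries = new ArrayList<>();
    for (WALSegment segment : segments) {
        for (WALEntry entry : segment.readAll()) {
            if (entry.getEntryIndex() >= startIndex) {
                entries.add(entry);
            }
        }
    }
    return entries;
}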

Examples

  • The log implementation in consensus implementations such as Zookeeper and Raft uses log segmentation.
  • The storage implementation in Kafka follows log segmentation.
  • Databases, including NoSQL databases like Cassandra, use a roll-over strategy based on a pre-configured log size.

Significant Revisions

13 August 2020: Published