Guiding an LLM for Robust Java ByteBuffer Code

Unmesh is a Distinguished Engineer at Thoughtworks, based in Pune, India. He is the author of Patterns of Distributed Systems.

This article is part of “Exploring Gen AI”, a series capturing Thoughtworks technologists' explorations of using generative AI technology for software development.

17 April 2025

AI editors like Cursor can generate code with remarkable speed using LLMs, handling boilerplate and providing functional snippets for various tasks. However, when building robust systems, functional correctness is only the starting point. Code must also be safe, predictable, maintainable, and free from subtle side effects.

This article demonstrates, through a dialogue between a developer and an LLM, how expert guidance is crucial to transforming an initial, potentially unsafe code snippet into a robust, system-ready component. The session was part of a larger effort to write a BoltDB-like storage engine for an internal workshop, and it also shows how breaking a larger development effort into smaller programming sessions can make LLM-based development more effective. We'll use the common task of reading and writing a fixed-structure header within a Java ByteBuffer.

The Scenario: Defining a Page Header

Our goal is to create a Java class to manage a simple page header structure within a ByteBuffer. The structure includes: Page Type (byte), Page Level (byte), Number of Keys (short), Free Space Offset (int), and Right Sibling Page ID (int).
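The field sizes above determine where each field sits in the buffer. As a minimal sketch (the class name PageHeaderLayout is hypothetical and not part of the original session), each offset can be derived from the sizes of the fields that precede it, assuming the fields are packed back to back with no padding:

```java
// Hypothetical helper, not from the original session: derives each field's
// offset from the sizes of the fields before it (packed, no padding).
final class PageHeaderLayout {
    static final int PAGE_TYPE_OFFSET = 0;
    static final int LEVEL_OFFSET = PAGE_TYPE_OFFSET + Byte.BYTES;                    // 1
    static final int NUM_KEYS_OFFSET = LEVEL_OFFSET + Byte.BYTES;                     // 2
    static final int FREE_SPACE_OFFSET_OFFSET = NUM_KEYS_OFFSET + Short.BYTES;        // 4
    static final int RIGHT_SIBLING_OFFSET = FREE_SPACE_OFFSET_OFFSET + Integer.BYTES; // 8
    static final int HEADER_SIZE = RIGHT_SIBLING_OFFSET + Integer.BYTES;              // 12

    private PageHeaderLayout() {
        // Constants only; not meant to be instantiated.
    }
}
```

The resulting values (0, 1, 2, 4, 8, and a total of 12 bytes) are the same constants that appear as literals in the code generated during the conversation.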

The Conversation Begins

Initial Prompt to Start the Conversation

Developer: “Can you show me how to write and read a simple page header structure to a Java ByteBuffer? The header should contain: Page Type (byte), Page Level (byte), Number of Keys (short), Free Space Offset (int), Right Sibling Page ID (int).”

LLM-Generated Code (Using position())

LLM: “Certainly. Here's a basic example using ByteBuffer's position() method to manage reads and writes:”

import java.nio.ByteBuffer;

public class PageHeaderV1 {
    // Field offsets within the header, in bytes
    private static final int PAGE_TYPE_OFFSET = 0;
    private static final int LEVEL_OFFSET = 1; // pageType is 1 byte
    private static final int NUM_KEYS_OFFSET = 2; // level is 1 byte
    private static final int FREE_SPACE_OFFSET_OFFSET = 4; // numKeys is 2 bytes
    private static final int RIGHT_SIBLING_OFFSET = 8; // freeSpaceOffset is 4 bytes
    public static final int HEADER_SIZE = 12; // rightSibling is 4 bytes

    public static void write(ByteBuffer buffer, byte pageType, byte level, short numKeys, int freeSpaceOffset, int rightSiblingPageId) {
        buffer.clear(); // Reset position to 0, limit to capacity

        buffer.position(PAGE_TYPE_OFFSET);
        buffer.put(pageType);

        buffer.position(LEVEL_OFFSET);
        buffer.put(level);

        buffer.position(NUM_KEYS_OFFSET);
        buffer.putShort(numKeys);

        buffer.position(FREE_SPACE_OFFSET_OFFSET);
        buffer.putInt(freeSpaceOffset);

        buffer.position(RIGHT_SIBLING_OFFSET);
        buffer.putInt(rightSiblingPageId);
    }

    public static byte getPageType(ByteBuffer buffer) {
        buffer.position(PAGE_TYPE_OFFSET);
        return buffer.get();
    }

    public static byte getLevel(ByteBuffer buffer) {
        buffer.position(LEVEL_OFFSET);
        return buffer.get();
    }

    public static short getNumKeys(ByteBuffer buffer) {
        buffer.position(NUM_KEYS_OFFSET);
        return buffer.getShort();
    }

    public static int getFreeSpaceOffset(ByteBuffer buffer) {
        buffer.position(FREE_SPACE_OFFSET_OFFSET);
        return buffer.getInt();
    }

    public static int getRightSiblingPageId(ByteBuffer buffer) {
        buffer.position(RIGHT_SIBLING_OFFSET);
        return buffer.getInt();
    }
}

This code uses position() to move to the correct offset before each read or write.

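Developer Insight: The LLM produced functionally correct code for a simple case. However, an expert developer immediately recognizes a potential problem: buffer.position() modifies the state of the ByteBuffer object itself, so every other user of the same buffer sees the moved position. The write method goes further and calls clear(), which also resets the buffer's limit.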
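To make the hazard concrete, here is a minimal sketch of how a position-based getter can interfere with unrelated code that shares the buffer. The class PositionSideEffectDemo and the scenario (a writer that intends to append at index 100) are assumptions made for illustration; only the getter mirrors PageHeaderV1.getNumKeys.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; the scenario is an illustration, not from the session.
final class PositionSideEffectDemo {
    static final int NUM_KEYS_OFFSET = 2;

    // Same approach as PageHeaderV1.getNumKeys: move the position, then read.
    static short getNumKeysRelative(ByteBuffer buffer) {
        buffer.position(NUM_KEYS_OFFSET);
        return buffer.getShort(); // relative read advances the position to 4
    }

    // Returns the index at which the "appended" byte actually landed.
    static int appendAfterHeaderRead(ByteBuffer page) {
        page.position(100);             // the writer intends to append at index 100
        getNumKeysRelative(page);       // a header read happens in between
        int landedAt = page.position(); // the position is now 4, not 100
        page.put((byte) 0x7F);          // the relative put writes at index 4
        return landedAt;
    }

    public static void main(String[] args) {
        ByteBuffer page = ByteBuffer.allocate(4096);
        System.out.println("byte landed at index " + appendAfterHeaderRead(page));
        // prints "byte landed at index 4"
    }
}
```

Index 4 is where the Free Space Offset field starts, so in this sketch an innocent-looking header read causes the next relative write to overwrite part of the header.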

Developer Suggests Absolute Offsets

Developer: “Thanks. However, constantly changing the buffer's position can cause issues, especially with shared buffers like MappedByteBuffer or in concurrent code. It introduces side effects. Can we rewrite this using the absolute get(index) and put(index, value) methods instead to make it stateless regarding the buffer's position?”

Developer Insight: The developer identifies the specific issue (side effects via state mutation) and understands the context where it becomes dangerous (shared buffers, concurrency). They guide the LLM towards a safer, stateless pattern.

LLM: “You are absolutely right. Using absolute offset methods avoids modifying the buffer's position state, which is much safer for shared or concurrent use cases. Here's how you can read the numKeys and freeSpaceOffset using the absolute methods:”

import java.nio.ByteBuffer;

// Example correction for a couple of methods:
public class PageHeaderCorrection {
    private static final int NUM_KEYS_OFFSET = 2;
    private static final int FREE_SPACE_OFFSET_OFFSET = 4;

    // Corrected getter using absolute offset
    public static short getNumKeys(ByteBuffer buffer) {
        return buffer.getShort(NUM_KEYS_OFFSET); // Reads at index 2, doesn't change position
    }

    // Corrected getter using absolute offset
    public static int getFreeSpaceOffset(ByteBuffer buffer) {
        return buffer.getInt(FREE_SPACE_OFFSET_OFFSET); // Reads at index 4, doesn't change position
    }

    // Similarly, writing uses the absolute put variants, e.g. putShort(index, value):
    public static void writeNumKeys(ByteBuffer buffer, short numKeys) {
        buffer.putShort(NUM_KEYS_OFFSET, numKeys); // Writes at index 2, doesn't change position
    }
}

Using get(index) and put(index, value) ensures the operations are side-effect-free concerning the buffer's position.
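The claim that absolute accessors leave the buffer's state alone can be checked with a small sketch. The class name AbsoluteAccessDemo and the chosen position and limit values are assumptions for illustration; the two accessors follow the corrected methods above.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; the position/limit values are arbitrary.
final class AbsoluteAccessDemo {
    static final int NUM_KEYS_OFFSET = 2;

    static void writeNumKeys(ByteBuffer buffer, short numKeys) {
        buffer.putShort(NUM_KEYS_OFFSET, numKeys); // absolute write at index 2
    }

    static short getNumKeys(ByteBuffer buffer) {
        return buffer.getShort(NUM_KEYS_OFFSET);   // absolute read at index 2
    }

    public static void main(String[] args) {
        ByteBuffer page = ByteBuffer.allocate(4096);
        page.position(100); // state owned by some other part of the system
        page.limit(200);

        writeNumKeys(page, (short) 42);
        System.out.println(getNumKeys(page));                       // prints 42
        System.out.println(page.position() + " " + page.limit());   // prints "100 200"
    }
}
```

Note that this removes the position side effect only. The ByteBuffer documentation states that buffers are not safe for use by multiple concurrent threads, so concurrent writers to the same bytes still need their own synchronization.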

Developer Insight: The LLM correctly implements the suggested pattern. The developer now pushes for better software design, moving from static utility methods to an object-oriented approach.

Developer Requests Refactoring to Instance-Based Class

Developer: “Excellent, that's exactly the pattern needed. Please provide the complete PageHeader class using only the absolute offset methods (get(index), put(index, value), getShort(index), putShort(index, value), etc.) for all fields. Also, let's refactor this. Instead of static methods, make PageHeader a class that takes the ByteBuffer in its constructor. It should act as a view or wrapper around that buffer section. Remove the static write method that takes all attributes. Instead, provide instance methods for getting and setting each field directly on the buffer provided in the constructor, still using the safe absolute offset methods.”

Developer Insight: The developer applies software design principles: Encapsulation (wrapping the buffer) and Object-Oriented Design (instance methods acting on object state – the buffer reference). This improves usability and maintainability.

LLM: “Understood. Refactoring the PageHeader into a class that wraps a ByteBuffer provided at construction is a good way to encapsulate header access for a specific page buffer. Here's the refactored class with instance methods for getting and setting fields using absolute offsets:”

import java.nio.ByteBuffer;

public class PageHeader {
    // Define constants for offsets and size
    private static final int OFFSET_PAGE_TYPE = 0;
    private static final int OFFSET_LEVEL = 1;
    private static final int OFFSET_NUM_KEYS = 2;
    private static final int OFFSET_FREE_SPACE = 4;
    private static final int OFFSET_RIGHT_SIBLING = 8;
    public static final int HEADER_SIZE = 12; // Total size of the header

    private final ByteBuffer buffer; // Store the buffer instance

    /**
     * Creates a PageHeader view wrapping the given ByteBuffer.
     * Assumes the buffer's relevant section starts at position 0
     * or that the buffer slice/view is correctly positioned.
     * @param buffer The ByteBuffer containing the page data.
     */
    public PageHeader(ByteBuffer buffer) {
        // Consider adding checks: buffer != null, buffer.capacity() >= HEADER_SIZE
        this.buffer = buffer;
    }

    // --- Getters using absolute offsets ---

    public byte getPageType() {
        return buffer.get(OFFSET_PAGE_TYPE);
    }

    public byte getLevel() {
        return buffer.get(OFFSET_LEVEL);
    }

    public short getNumKeys() {
        return buffer.getShort(OFFSET_NUM_KEYS);
    }

    public int getFreeSpaceOffset() {
        return buffer.getInt(OFFSET_FREE_SPACE);
    }

    public int getRightSiblingPageId() {
        return buffer.getInt(OFFSET_RIGHT_SIBLING);
    }

    // --- Setters using absolute offsets ---

    public void setPageType(byte pageType) {
        buffer.put(OFFSET_PAGE_TYPE, pageType);
    }

    public void setLevel(byte level) {
        buffer.put(OFFSET_LEVEL, level);
    }

    public void setNumKeys(short numKeys) {
        buffer.putShort(OFFSET_NUM_KEYS, numKeys);
    }

    public void setFreeSpaceOffset(int freeSpaceOffset) {
        buffer.putInt(OFFSET_FREE_SPACE, freeSpaceOffset);
    }

    public void setRightSiblingPageId(int rightSiblingPageId) {
        buffer.putInt(OFFSET_RIGHT_SIBLING, rightSiblingPageId);
    }

    /**
     * Initializes a new header structure with default/zero values
     * in the underlying buffer.
     */
    public void initializeNewHeader() {
        setPageType((byte) 0);
        setLevel((byte) 0);
        setNumKeys((short) 0);
        setFreeSpaceOffset(HEADER_SIZE); // Default free space starts after header
        setRightSiblingPageId(0); // Assuming 0 represents no sibling initially
        // Or use specific constants for null/invalid IDs if available
    }
}

This version encapsulates access to the header fields within an instance tied to a specific ByteBuffer. All read and write operations use absolute offsets, ensuring safety and avoiding side effects on the buffer's position. I've also added an example initializeNewHeader method for convenience.
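Two loose ends in the class above are the suggested constructor checks and the assumption that the header starts at index 0 of the buffer. The following is a minimal sketch of one way to handle both, not the implementation from the session: CheckedPageHeader and pageView are hypothetical names, the class is trimmed to a single field for brevity, and the page size of 4096 is an assumed value. It checks limit() rather than capacity() because absolute accessors are bounds-checked against the buffer's limit.

```java
import java.nio.ByteBuffer;
import java.util.Objects;

// Hypothetical variant of PageHeader, trimmed to one field for brevity.
final class CheckedPageHeader {
    private static final int OFFSET_NUM_KEYS = 2;
    static final int HEADER_SIZE = 12;

    private final ByteBuffer buffer;

    CheckedPageHeader(ByteBuffer buffer) {
        Objects.requireNonNull(buffer, "buffer");
        // Absolute accessors are bounds-checked against limit(), so that is
        // the value compared here rather than capacity().
        if (buffer.limit() < HEADER_SIZE) {
            throw new IllegalArgumentException(
                "buffer too small for header: limit=" + buffer.limit());
        }
        this.buffer = buffer;
    }

    short getNumKeys() {
        return buffer.getShort(OFFSET_NUM_KEYS);
    }

    void setNumKeys(short numKeys) {
        buffer.putShort(OFFSET_NUM_KEYS, numKeys);
    }

    // Assumed helper: returns a view whose index 0 is the first byte of the
    // given page. The duplicate has its own position and limit, so the
    // caller's buffer state is never changed; the content is shared.
    static ByteBuffer pageView(ByteBuffer file, int pageId, int pageSize) {
        ByteBuffer dup = file.duplicate();
        dup.clear();                        // resets only the duplicate's state
        dup.limit((pageId + 1) * pageSize);
        dup.position(pageId * pageSize);
        return dup.slice();
    }

    public static void main(String[] args) {
        ByteBuffer file = ByteBuffer.allocate(3 * 4096); // stand-in for a mapped file
        CheckedPageHeader header = new CheckedPageHeader(pageView(file, 1, 4096));
        header.setNumKeys((short) 7);
        System.out.println(file.getShort(4096 + 2)); // prints 7: the write landed in page 1
        System.out.println(file.position());         // prints 0: the caller's position is untouched
    }
}
```

With a view per page, the header offsets stay relative to the start of the page, and the wrapper never needs to know where the page lives inside the larger buffer.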

Takeaways: The Importance of the Developer Role

More improvements are possible, but this dialogue highlights the critical areas where developer expertise is indispensable when working with LLMs:

Identifying Hidden Risks: The initial code worked, but the developer recognized the latent danger of side effects from position() in broader system contexts.
Contextual Awareness: Understanding implications for MappedByteBuffer, concurrency, and library design was key to steering away from the stateful approach.
Guiding Towards Robust Patterns: The developer explicitly prompted for stateless, absolute offset methods, leveraging deep knowledge of safe low-level programming practices.
Applying Software Design Principles: The final refactoring wasn't just about correctness but about better design – encapsulation, instance-based logic, maintainability (constants), and usability.
Critical Evaluation: Throughout the process, the developer critically evaluated the LLM's output against not just functional requirements but also non-functional requirements like safety, stability, and maintainability.

Conclusion

LLMs are incredibly powerful coding assistants, accelerating development and handling complex tasks. However, as this case study shows, they are tools that respond to guidance. Building robust, reliable, and performant systems requires the critical thinking, contextual understanding, and deep systems knowledge of an experienced developer. The expert doesn't just prompt for code; they evaluate, guide, refine, and integrate, ensuring the final product meets the rigorous demands of real-world software engineering.