Sweepstore Header Structure

The Sweepstore file format uses a structured header to manage file metadata and concurrency control. The header consists of three main parts: the static header, the concurrency header, and dynamic worker tickets.

Static Header (Bytes 0-28)

The static header contains basic file information and pointers.

| Offset | Size | Field | Type | Description |
|--------|------|-------|------|-------------|
| 0 | 4 bytes | Magic Number | String | File identifier, must be "SWPT" |
| 4 | 12 bytes | Version | String | Version string (UTF-8), max 11 chars (padded with spaces) |
| 16 | 8 bytes | Address Table Pointer | int64 | Pointer to the address table location |
| 24 | 4 bytes | Free List Count | int32 | Number of entries in the free list |
| 28 | 1 byte | Is Free List Lifted | bool | Flag indicating if free list is lifted (0=false, 1=true) |

Total Size: 29 bytes
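
As an illustration of the layout above, here is a minimal Python sketch that decodes the 29-byte static header with the standard `struct` module. The names `STATIC_HEADER`, `StaticHeader`, and `parse_static_header` are hypothetical and not part of Sweepstore; the sketch also assumes the version padding can simply be stripped from both ends.

```python
import struct
from typing import NamedTuple

# Little-endian, no alignment padding:
# 4s = magic, 12s = version, q = int64 pointer, i = int32 count, ? = bool flag
STATIC_HEADER = struct.Struct("<4s12sqi?")


class StaticHeader(NamedTuple):
    magic: str
    version: str
    address_table_pointer: int
    free_list_count: int
    is_free_list_lifted: bool


def parse_static_header(data: bytes) -> StaticHeader:
    """Decode bytes 0-28 of a Sweepstore file (hypothetical helper)."""
    magic, version, pointer, count, lifted = STATIC_HEADER.unpack_from(data, 0)
    if magic != b"SWPT":
        raise ValueError(f"unexpected magic number: {magic!r}")
    # Assumption: surrounding spaces in the version field are padding only.
    return StaticHeader(
        magic.decode("ascii"),
        version.decode("utf-8").strip(),
        pointer,
        count,
        lifted,
    )
```

`STATIC_HEADER.size` evaluates to 29, which matches the total size stated above.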

Concurrency Header (Bytes 29-45)

The concurrency header manages multi-threaded access and coordination.

| Offset | Size | Field | Type | Description |
|--------|------|-------|------|-------------|
| 29 | 8 bytes | Master Identifier | int64 | Unique identifier for the master process |
| 37 | 4 bytes | Master Heartbeat | int32 | Heartbeat counter for the master process |
| 41 | 4 bytes | Number of Workers | int32 | Total number of concurrent worker tickets |
| 45 | 1 byte | Is Read Allowed | bool | Flag indicating if read operations are allowed (0=false, 1=true) |

Total Size: 17 bytes
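
The concurrency header can be decoded the same way. The following is a sketch only; `CONCURRENCY_HEADER`, `ConcurrencyHeader`, and `parse_concurrency_header` are hypothetical names chosen for this example.

```python
import struct
from typing import NamedTuple

CONCURRENCY_OFFSET = 29  # first byte after the static header

# Little-endian: q = int64 master id, i = int32 heartbeat,
# i = int32 worker count, ? = bool read-allowed flag
CONCURRENCY_HEADER = struct.Struct("<qii?")


class ConcurrencyHeader(NamedTuple):
    master_identifier: int
    master_heartbeat: int
    number_of_workers: int
    is_read_allowed: bool


def parse_concurrency_header(data: bytes) -> ConcurrencyHeader:
    """Decode bytes 29-45 of a Sweepstore file (hypothetical helper)."""
    fields = CONCURRENCY_HEADER.unpack_from(data, CONCURRENCY_OFFSET)
    return ConcurrencyHeader(*fields)
```

The `number_of_workers` field read here is what determines how many worker tickets follow the header.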

Worker Tickets (Starting at Byte 46)

The worker ticket region is sized by the number of workers specified in the concurrency header. Each ticket has a fixed size of 30 bytes.

Base Offset Calculation: 46 + (ticketIndex * 30)

Single Ticket Structure

| Relative Offset | Size | Field | Type | Description |
|-----------------|------|-------|------|-------------|
| 0 | 4 bytes | Identifier | int32 | Unique identifier for this worker |
| 4 | 4 bytes | Worker Heartbeat | int32 | Heartbeat counter for this worker |
| 8 | 1 byte | Ticket State | byte (enum) | Current state of the ticket (see SweepstoreTicketState) |
| 9 | 1 byte | Ticket Operation | byte (enum) | Current operation being performed (see SweepstoreTicketOperation) |
| 10 | 8 bytes | Key Hash | int64 | Hash of the key being operated on |
| 18 | 8 bytes | Write Pointer | int64 | Pointer to the write location |
| 26 | 4 bytes | Write Size | int32 | Size of the write operation |

Ticket Size: 30 bytes
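
To show how the base offset calculation and the ticket layout fit together, here is a minimal Python sketch that reads one ticket by index. `TICKET`, `WorkerTicket`, `ticket_offset`, and `read_ticket` are hypothetical names; the state and operation bytes are returned as raw integers.

```python
import struct
from typing import NamedTuple

TICKETS_START = 46  # first byte after the concurrency header

# Little-endian: i = identifier, i = heartbeat, B = state, B = operation,
# q = key hash, q = write pointer, i = write size
TICKET = struct.Struct("<iiBBqqi")


class WorkerTicket(NamedTuple):
    identifier: int
    worker_heartbeat: int
    ticket_state: int
    ticket_operation: int
    key_hash: int
    write_pointer: int
    write_size: int


def ticket_offset(ticket_index: int) -> int:
    """Absolute file offset of the ticket at the given index."""
    return TICKETS_START + ticket_index * TICKET.size


def read_ticket(data: bytes, ticket_index: int) -> WorkerTicket:
    """Decode a single 30-byte worker ticket (hypothetical helper)."""
    return WorkerTicket(*TICKET.unpack_from(data, ticket_offset(ticket_index)))
```

With this layout, `ticket_offset(0)` is 46 and `ticket_offset(1)` is 76.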

Enumerations

Enum fields are stored as single-byte integers. The following tables show the integer values for each enum state:

SweepstoreTicketState (1 byte)

| Value | Name | Description |
|-------|------|-------------|
| 0 | IDLE | Ticket is idle and not performing any work |
| 1 | WAITING | Ticket is waiting for approval |
| 2 | APPROVED | Ticket has been approved to proceed |
| 3 | EXECUTING | Ticket is actively executing an operation |
| 4 | COMPLETED | Ticket has completed its operation |

SweepstoreTicketOperation (1 byte)

| Value | Name | Description |
|-------|------|-------------|
| 0 | NONE | No operation assigned |
| 1 | READ | Read operation |
| 2 | MODIFY | Modify operation |
| 3 | WRITE | Write operation |
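
One possible way to mirror these enumerations in Python is with `IntEnum`, so that the stored byte converts directly to a named member. This is a sketch: the class names follow the tables above, while `encode_enum` is a hypothetical helper.

```python
from enum import IntEnum


class SweepstoreTicketState(IntEnum):
    IDLE = 0
    WAITING = 1
    APPROVED = 2
    EXECUTING = 3
    COMPLETED = 4


class SweepstoreTicketOperation(IntEnum):
    NONE = 0
    READ = 1
    MODIFY = 2
    WRITE = 3


def encode_enum(member: IntEnum) -> bytes:
    """Serialize an enum member as the single byte stored in a ticket."""
    return bytes([int(member)])
```

Decoding is the reverse: `SweepstoreTicketState(raw_byte)` returns the matching member and raises `ValueError` for a value outside the table.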

Total Header Size Calculation

The total header size depends on the number of workers:

Total Header Size = 46 + (numberOfWorkers * 30) bytes

For example:

  • 4 workers: 46 + (4 × 30) = 166 bytes
  • 8 workers: 46 + (8 × 30) = 286 bytes
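
The same calculation as a small Python function, for illustration. The name `total_header_size` and the constants are hypothetical; rejecting negative worker counts is an assumption of this sketch.

```python
FIXED_HEADER_SIZE = 46  # static header (29) + concurrency header (17)
TICKET_SIZE = 30


def total_header_size(number_of_workers: int) -> int:
    """Total header size in bytes for the given number of worker tickets."""
    if number_of_workers < 0:
        raise ValueError("number_of_workers must not be negative")
    return FIXED_HEADER_SIZE + number_of_workers * TICKET_SIZE


print(total_header_size(4))  # 166
print(total_header_size(8))  # 286
```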

Initialization

When initializing a new Sweepstore file using initialiseSweepstoreHeader():

  • Magic number is set to "SWPT"
  • Version is set to "undefined"
  • Address table pointer is set to the null pointer (-1)
  • Free list count is set to 0
  • Is free list lifted flag is set to false
  • Master identifier and heartbeat are set to 0
  • Number of workers is set according to the parameter (default: 4)
  • Read allowed flag is set to false
  • All worker tickets are initialized with identifier set to 0, heartbeat set to 0, IDLE state (0), and NONE operation (0)
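
The initial values listed above can be sketched as a Python function that builds the header bytes. This is not the actual `initialiseSweepstoreHeader()` implementation; `build_initial_header` is a hypothetical name. The sketch assumes that the version padding is trailing and that the ticket fields not mentioned above (key hash, write pointer, write size) start at 0.

```python
import struct

NULL_POINTER = -1  # int64 value representing a null pointer


def build_initial_header(number_of_workers: int = 4) -> bytes:
    """Build the bytes of a freshly initialized header (illustrative only)."""
    # One leading space, the text "undefined", then spaces up to 12 bytes.
    version = b" undefined".ljust(12, b" ")
    static = struct.pack("<4s12sqi?", b"SWPT", version, NULL_POINTER, 0, False)
    # Master identifier 0, master heartbeat 0, worker count, reads not allowed.
    concurrency = struct.pack("<qii?", 0, 0, number_of_workers, False)
    # Identifier 0, heartbeat 0, IDLE (0), NONE (0); remaining fields
    # are assumed to be zero because the list above does not specify them.
    ticket = struct.pack("<iiBBqqi", 0, 0, 0, 0, 0, 0, 0)
    return static + concurrency + ticket * number_of_workers
```

With the default of 4 workers the result is 166 bytes long, which agrees with the total header size calculation.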

Implementation Notes

  • All multi-byte integers are stored in little-endian byte order
  • The version field begins with a space character, followed by the version text (at most 11 characters), and is padded with spaces to fill its 12 bytes
  • Boolean values are stored as single bytes (0 or 1)
  • Enum values are stored as single-byte integers using their index values (0, 1, 2, etc.)
  • Pointers use int64 for addressing, with -1 representing a null pointer
  • The header is designed for concurrent access with heartbeat-based liveness detection
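
A few of these encoding rules, sketched in Python for illustration. `encode_version`, `encode_pointer`, and `encode_bool` are hypothetical helpers; the sketch assumes the version padding is trailing and measures the 11-character limit in UTF-8 bytes.

```python
import struct
from typing import Optional

VERSION_FIELD_SIZE = 12


def encode_version(version: str) -> bytes:
    """Leading space, then the version text, padded with spaces to 12 bytes."""
    raw = version.encode("utf-8")
    if len(raw) > VERSION_FIELD_SIZE - 1:
        raise ValueError("version text must fit in 11 bytes")
    return (b" " + raw).ljust(VERSION_FIELD_SIZE, b" ")


def encode_pointer(address: Optional[int]) -> bytes:
    """int64 in little-endian byte order; None maps to the null pointer -1."""
    return struct.pack("<q", -1 if address is None else address)


def encode_bool(value: bool) -> bytes:
    """Booleans are stored as a single byte, 0 or 1."""
    return b"\x01" if value else b"\x00"
```

For example, `encode_pointer(None)` yields eight `0xFF` bytes, the two's-complement representation of -1, and `encode_pointer(1)` yields `0x01` followed by seven zero bytes.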