Simdjsont
High-level API.
This module provides convenience functions for:
- validating JSON (Validate)
- extracting values by JSON pointer (Extract)
- encoding/decoding typed values via codecs (Codec, Decode, Encode)
- decoding NDJSON / JSON-Lines streams (Ndjson)
Strings and bigstrings
Every consume/produce entry point comes in two flavours: one that works on OCaml strings, and a _bigstring sibling that works on Raw.buffer — the standard bigstring type (char, int8_unsigned_elt, c_layout) Bigarray.Array1.t, shared with Bigstringaf, Lwt_bytes and Core.Bigstring. The bigstring path avoids copying through a string: parsing is zero-copy when the buffer already has the padding bytes the parser requires (see Raw.ensure_padded), and encode_bigstring produces a parse-ready buffer that can be fed straight back into decode_bigstring.
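The round trip described above can be sketched as follows. This is a minimal sketch, not the library's own example: it assumes the simdjsont package is linked, and it assumes the int returned by encode_bigstring is the data length that decode_bigstring expects as ~len (the signatures suggest this, but it is not stated here). round_trip is a hypothetical helper name.

```ocaml
(* Hypothetical helper: encode a value to a bigstring, then decode it
   again without going through an OCaml string.
   Assumption: the int returned by encode_bigstring is the number of
   data bytes, which is what decode_bigstring takes as ~len. *)
let round_trip codec value =
  let buf, len = Simdjsont.encode_bigstring codec value in
  Simdjsont.decode_bigstring codec buf ~len

let () =
  match round_trip Simdjsont.Codec.int 42 with
  | Ok n -> Printf.printf "round trip: %d\n" n
  | Error msg -> Printf.eprintf "decode failed: %s\n" msg
```

Because the encoded buffer is described as parse-ready, the sketch makes no call to Raw.ensure_padded between the two steps.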
For low-level access to the underlying simdjson parser and elements, see Raw.
Example: typed records
type user = { id : int; name : string; email : string option }

let user_codec =
  let open Simdjsont.Codec in
  Obj.field (fun id name email -> { id; name; email })
  |> Obj.mem "id" int ~enc:(fun u -> u.id)
  |> Obj.mem "name" string ~enc:(fun u -> u.name)
  |> Obj.opt_mem "email" string ~enc:(fun u -> u.email)
  |> Obj.finish

let decoded =
  Simdjsont.decode user_codec
    {|{"id":1,"name":"Ada","email":"ada@example.test"}|}

let encoded =
  Simdjsont.encode user_codec
    { id = 1; name = "Ada"; email = Some "ada@example.test" }
Choosing an API
Use Validate to check whether bytes are JSON, Extract to read a few values by JSON pointer, Decode / Encode for typed application data, Ndjson for streams of documents, Cbor for JSON-compatible CBOR, and Raw only when you need low-level simdjson access.
module Json : sig ... end
Dynamic JSON value representation used by this library.
module Codec : sig ... end
Codecs used to decode and encode typed OCaml values.
module Raw : sig ... end
Low-level bindings: parsers, elements, arrays, objects, JSON pointers, and streaming. Prefer the high-level modules unless you need parser lifetime control or byte offsets.
module Validate : sig ... end
JSON validity checks.
module Extract : sig ... end
Extract values from a JSON string using a JSON pointer.
module Decode : sig ... end
Codecs and decoding functions.
module Encode : sig ... end
Encoding using a codec.
NDJSON / JSON streams
Newline-delimited / concatenated JSON documents (NDJSON, JSON Lines), decoded lazily through a codec. This wraps the lower-level Raw.Stream. Note: this is not JSON-LD (W3C linked data).
let events = "1\n2\n3\n" in
Simdjsont.Ndjson.decode_string_seq Simdjsont.Codec.int events
|> Seq.iter (function
     | Ok n -> Printf.printf "event=%d\n" n
     | Error msg -> Printf.eprintf "bad event: %s\n" msg)
module Ndjson : sig ... end
val padding : int
Number of trailing padding bytes the parser requires; re-export of Raw.padding.
val create_bigstring : int -> Raw.buffer
Allocate a parse-ready bigstring with room for the given number of data bytes (padding is added automatically); re-export of Raw.create_buffer.
val bigstring_of_string : string -> Raw.buffer
Copy a string into a freshly padded bigstring; re-export of Raw.buffer_of_string.
val validate : string -> bool
validate json is a convenience alias for Validate.is_valid.
val validate_bigstring : Raw.buffer -> len:int -> bool
validate_bigstring buf ~len is a convenience alias for Validate.is_valid_bigstring.
val decode : 'a Codec.t -> string -> ('a, string) result
decode codec json is a convenience alias for Codec.decode_string.
val decode_bigstring : 'a Codec.t -> Raw.buffer -> len:int -> ('a, string) result
decode_bigstring codec buf ~len is a convenience alias for Codec.decode_bigstring.
val encode : 'a Codec.t -> 'a -> string
encode codec value is a convenience alias for Codec.encode_string.
val encode_bigstring : 'a Codec.t -> 'a -> Raw.buffer * int
encode_bigstring codec value is a convenience alias for Codec.encode_to_bigstring.
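As an illustration of how the string and _bigstring siblings listed above pair up, here is a minimal sketch. It assumes the simdjsont package is linked and that ~len means the number of data bytes, excluding padding, which is not spelled out here. The names json, ok_string, buf, ok_bigstring and describe are hypothetical.

```ocaml
(* The same document, checked through both entry points. *)
let json = {|[1,2,3]|}

(* String flavour. *)
let ok_string = Simdjsont.validate json

(* Bigstring flavour: copy into a freshly padded buffer first.
   Assumption: ~len is the data length, not counting padding bytes. *)
let buf = Simdjsont.bigstring_of_string json
let ok_bigstring = Simdjsont.validate_bigstring buf ~len:(String.length json)

(* decode returns a result, so both outcomes are handled explicitly. *)
let describe (input : string) : string =
  match Simdjsont.decode Simdjsont.Codec.int input with
  | Ok n -> Printf.sprintf "int %d" n
  | Error msg -> "error: " ^ msg
```

Returning a result rather than raising keeps the error path visible at the call site, which is why describe matches on both constructors.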
CBOR
CBOR (Concise Binary Object Representation, RFC 8949) support for the JSON-compatible subset of CBOR, using the same codec infrastructure as JSON. This means a record codec can be shared by JSON and CBOR:
type point = { x : int; y : int }

let point =
  let open Simdjsont.Codec in
  Obj.field (fun x y -> { x; y })
  |> Obj.mem "x" int ~enc:(fun p -> p.x)
  |> Obj.mem "y" int ~enc:(fun p -> p.y)
  |> Obj.finish

let bytes = Simdjsont.Cbor.encode_string point { x = 10; y = 20 }
let decoded = Simdjsont.Cbor.decode_string point bytes
Supported CBOR types:
- Integers (major types 0, 1) including 64-bit
- Floats (major type 7) including half-precision (16-bit) and single/double precision
- Byte strings (major type 2), represented as OCaml strings because the public JSON-compatible model has no separate bytes constructor
- Text strings (major type 3) including indefinite-length
- Arrays (major type 4) including indefinite-length
- Maps (major type 5) with text-string or byte-string keys, including indefinite-length maps
- Tags (major type 6), decoded by ignoring the tag number and decoding the tagged value
- Simple values: true, false, null, and undefined (decoded as null)
Not supported as distinct semantic values:
- Semantic interpretation of tags such as dates, URIs, encoded CBOR, or bignums; tags are currently transparent wrappers
- Integer map keys
- Arbitrary-precision integers beyond what the 64-bit integer codecs can represent
- CBOR simple values other than booleans, null, and undefined
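The major-type numbers used in the lists above come from the first byte of each CBOR data item: in RFC 8949 the high three bits hold the major type and the low five bits hold the additional information. The following is a standalone sketch in plain OCaml that does not use this library; split_initial_byte and major_type_name are hypothetical names chosen for illustration.

```ocaml
(* Split a CBOR initial byte into (major type, additional info).
   Per RFC 8949: high 3 bits = major type, low 5 bits = additional info. *)
let split_initial_byte (b : int) : int * int =
  let b = b land 0xff in
  (b lsr 5, b land 0x1f)

(* Human-readable names for the eight major types. *)
let major_type_name = function
  | 0 -> "unsigned integer"
  | 1 -> "negative integer"
  | 2 -> "byte string"
  | 3 -> "text string"
  | 4 -> "array"
  | 5 -> "map"
  | 6 -> "tag"
  | 7 -> "simple value or float"
  | _ -> invalid_arg "major type must be in 0..7"

let () =
  (* 0x0a: unsigned 10; 0x63: 3-byte text string; 0x9f / 0xbf:
     indefinite-length array / map; 0xf5: true; 0xf7: undefined;
     0xf9: half-precision float. *)
  List.iter
    (fun b ->
      let major, info = split_initial_byte b in
      Printf.printf "0x%02x: major %d (%s), info %d\n" b major
        (major_type_name major) info)
    [ 0x0a; 0x63; 0x9f; 0xbf; 0xf5; 0xf7; 0xf9 ]
```

Additional info 31 marks the indefinite-length forms mentioned for text strings, arrays and maps, and values 25 to 27 under major type 7 select half, single and double precision floats.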
module Cbor : sig ... end