Eio.Buf_read
Buffered input and parsing.
This module provides fairly efficient non-backtracking parsers. It is modelled on Angstrom's API, and you should use that if backtracking is needed.
Example:
let r = Buf_read.of_flow flow ~max_size:1_000_000 in
Buf_read.line r
exception Buffer_limit_exceeded
Raised if parsing an item would require enlarging the buffer beyond its configured limit.
type 'a parser = t -> 'a
An 'a parser is a function that consumes and returns a value of type 'a.
val parse :
?initial_size:int ->
max_size:int ->
'a parser ->
_ Flow.source ->
('a, [> `Msg of string ]) result
parse p flow ~max_size uses p to parse everything in flow.
It is a convenience function that does:
let buf = of_flow flow ~max_size in
format_errors (p <* end_of_input) buf
val parse_exn :
?initial_size:int ->
max_size:int ->
'a parser ->
_ Flow.source ->
'a
parse_exn wraps parse, but raises Failure msg if that returns Error (`Msg msg).
Catching exceptions with parse and then raising them might seem pointless, but this has the effect of turning e.g. an End_of_file exception into a Failure with a more user-friendly message.
val parse_string : 'a parser -> string -> ('a, [> `Msg of string ]) result
parse_string p s uses p to parse everything in s. It is defined as format_errors (p <* end_of_input) (of_string s).
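As a hedged illustration of the definition above, the sketch below runs parse_string on two in-memory inputs. The alias R and the helper describe are hypothetical names introduced here. It assumes the eio library is linked, and no event loop should be needed because nothing is read from a flow.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias, for brevity *)

(* Hypothetical helper: render either outcome as a string. *)
let describe = function
  | Ok s -> "ok: " ^ s
  | Error (`Msg m) -> "error: " ^ m

(* The parser consumes the whole input, so the end_of_input check passes. *)
let complete = describe (R.parse_string R.line "hello\n")

(* take 3 leaves "def" unread, so the end_of_input check fails. *)
let trailing = describe (R.parse_string (R.take 3) "abcdef")

let () =
  print_endline complete;
  print_endline trailing
```

The exact wording of the error message is not shown here because the text above does not specify it.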
val parse_string_exn : 'a parser -> string -> 'a
parse_string_exn is like parse_string, but handles errors like parse_exn.
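A minimal sketch of the exception-raising variant, assuming the eio library is available. The alias R is a hypothetical name. Asking for more bytes than the input holds would normally surface as End_of_file, which parse_string_exn is described as reporting through Failure instead.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

(* The input has 3 bytes but the parser asks for 5. *)
let outcome =
  match R.parse_string_exn (R.take 5) "abc" with
  | s -> "parsed: " ^ s
  | exception Failure msg -> "failed: " ^ msg

let () = print_endline outcome
```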
val of_flow : ?initial_size:int -> max_size:int -> _ Flow.source -> t
of_flow ~max_size flow is a buffered reader backed by flow.
val of_buffer : Cstruct.buffer -> t
of_buffer buf is a reader that reads from buf. buf is used directly, without being copied. eof_seen (of_buffer buf) = true. This module will not modify buf itself, but it will expose it via peek.
val of_string : string -> t
of_string s is a reader that reads from s.
val as_flow : t -> Flow.source_ty Std.r
as_flow t is a buffered flow.
Reading from it will return data from the buffer, only reading the underlying flow if the buffer is empty.
val line : string parser
line parses one line.
Lines can be terminated by either LF or CRLF. The returned string does not include the terminator.
If End_of_file is reached after seeing some data but before seeing a line terminator, the data seen is returned as the last line.
val lines : string Seq.t parser
lines returns a sequence that lazily reads the next line until the end of the input is reached.
lines = seq line ~stop:at_end_of_input
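The line-terminator rules above can be sketched on an in-memory reader. This is an illustration only: R is a hypothetical alias, and the inputs are made up for the example.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

let r = R.of_string "a\r\nb\nc"
let first = R.line r    (* CRLF terminator is stripped *)
let second = R.line r   (* LF terminator is stripped *)
let last = R.line r     (* unterminated data before end-of-file *)

(* lines is lazy; List.of_seq forces the whole sequence here. *)
let all = List.of_seq (R.lines (R.of_string "x\ny\n"))

let () =
  List.iter print_endline [ first; second; last ];
  List.iter print_endline all
```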
val char : char -> unit parser
char c checks that the next byte is c and consumes it.
val any_char : char parser
any_char parses one character.
val peek_char : char option parser
peek_char returns Some c where c is the next character, but does not consume it.
Returns None at the end of the input stream rather than raising End_of_file.
val string : string -> unit parser
string s checks that s is the next string in the stream and consumes it.
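A short sketch of the single-character and literal-string parsers working together, assuming an in-memory reader. R is a hypothetical alias and the input text is invented for the example.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

let r = R.of_string "GET /"
let () = R.string "GET" r     (* consumes the literal "GET" *)
let () = R.char ' ' r         (* consumes the separating space *)
let peeked = R.peek_char r    (* looks at '/' without consuming it *)
let next = R.any_char r       (* now consumes '/' *)
let at_end = R.peek_char r    (* nothing left, so no exception, just None *)

let () =
  Printf.printf "next=%c more=%b\n" next (at_end <> None);
  ignore peeked
```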
val uint8 : int parser
uint8 parses the next byte as an unsigned 8-bit integer.
module BE : sig ... end
Big endian parsers
module LE : sig ... end
Little endian parsers
val take : int -> string parser
take n takes exactly n bytes from the input.
val take_all : string parser
take_all takes all remaining data until end-of-file.
Returns "" if already at end-of-file.
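For illustration, take and take_all can be combined on one reader as follows. R is a hypothetical alias and the input is arbitrary.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

let r = R.of_string "abcdef"
let head = R.take 2 r      (* exactly two bytes *)
let rest = R.take_all r    (* everything that remains *)
let again = R.take_all r   (* already at end-of-file *)

let () = Printf.printf "%S %S %S\n" head rest again
```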
val take_while : (char -> bool) -> string parser
take_while p finds the first byte for which p is false and consumes and returns all bytes before that.
If p is true for all remaining bytes, it returns everything until end-of-file.
It will return the empty string if there are no matching characters (and therefore never raises End_of_file).
val take_while1 : (char -> bool) -> string parser
take_while1 p is like take_while, except that the parser fails with "take_while1" unless it consumes at least one character of input.
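The difference between the two variants shows up when nothing matches. The sketch below assumes the failure is reported as a Failure exception, as for the other parsers in this module. R and is_digit are hypothetical names.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

(* Hypothetical predicate used only for this example. *)
let is_digit = function '0' .. '9' -> true | _ -> false

let digits = R.take_while is_digit (R.of_string "123abc")
let nothing = R.take_while is_digit (R.of_string "abc")

(* take_while1 rejects an empty match instead of returning "". *)
let rejected =
  match R.take_while1 is_digit (R.of_string "abc") with
  | _ -> false
  | exception Failure _ -> true

let () = Printf.printf "%S %S %b\n" digits nothing rejected
```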
val skip_while : (char -> bool) -> unit parser
skip_while p skips zero or more bytes for which p is true.
skip_while p t does the same thing as ignore (take_while p t), except that it is not limited by the buffer size.
val skip_while1 : (char -> bool) -> unit parser
skip_while1 p is like skip_while, except that the parser fails with "skip_while1" unless it skips at least one character of input.
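A small sketch of skipping leading padding before reading a token. R and is_space are hypothetical names, and the failure of skip_while1 is assumed to surface as a Failure exception.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

(* Hypothetical predicate used only for this example. *)
let is_space c = c = ' '

let r = R.of_string "   token rest"
let () = R.skip_while is_space r
let word = R.take_while (fun c -> not (is_space c)) r

(* With no space to skip, skip_while1 fails rather than doing nothing. *)
let rejected =
  match R.skip_while1 is_space (R.of_string "x") with
  | () -> false
  | exception Failure _ -> true

let () = Printf.printf "%S %b\n" word rejected
```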
val skip : int -> unit parser
skip n discards the next n bytes.
skip n = map ignore (take n), except that the number of skipped bytes may be larger than the buffer (it will not grow).
Note: if End_of_file is raised, all bytes in the stream will have been consumed.
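The following sketch illustrates both the normal case and the end-of-file note above. R is a hypothetical alias and the inputs are invented.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

(* Normal case: drop a 3-byte header, then read the rest. *)
let r = R.of_string "HDRpayload"
let () = R.skip 3 r
let body = R.take_all r

(* Skipping past the end raises End_of_file and leaves nothing unread. *)
let short = R.of_string "ab"
let hit_eof =
  match R.skip 5 short with
  | () -> false
  | exception End_of_file -> true
let drained = R.at_end_of_input short

let () = Printf.printf "%S %b %b\n" body hit_eof drained
```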
val at_end_of_input : bool parser
at_end_of_input returns true when at the end of the stream, or false if there is at least one more byte to be read.
val end_of_input : unit parser
end_of_input checks that there are no further bytes in the stream.
val seq : ?stop:bool parser -> 'a parser -> 'a Seq.t parser
seq p is a sequence that uses p to get the next item.
A sequence node can only be used while the stream is at the expected position, and will raise Invalid_argument if any bytes have been consumed in the meantime. This also means that each node can only be used once; use Seq.memoize to make the sequence persistent.
It is not necessary to consume all the elements of the sequence.
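As a sketch of how seq might be used, the example below splits an input into fixed-size chunks, then shows that forcing only the first node leaves the rest of the stream unread. R is a hypothetical alias. The ~stop argument is passed explicitly, mirroring the definition of lines, rather than relying on any default.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

(* Consume every item: three 2-byte chunks. *)
let chunks =
  List.of_seq (R.seq (R.take 2) ~stop:R.at_end_of_input (R.of_string "aabbcc"))

(* Force only the first node; later items are never parsed. *)
let r = R.of_string "aabbcc"
let first =
  match R.seq (R.take 2) ~stop:R.at_end_of_input r () with
  | Seq.Cons (x, _) -> Some x
  | Seq.Nil -> None
let unread = R.take_all r

let () =
  List.iter print_endline chunks;
  Printf.printf "%b %S\n" (first <> None) unread
```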
val pair : 'a parser -> 'b parser -> ('a * 'b) parser
pair a b is a parser that first uses a to parse a value x, then uses b to parse a value y, then returns (x, y).
Note that this module does not support backtracking, so if b fails then the bytes consumed by a are lost.
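The lack of backtracking can be observed directly. In this sketch, R is a hypothetical alias. The second case assumes that char reports a mismatch as a Failure and does not itself consume the mismatched byte.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

(* Both parsers succeed, giving a tuple. *)
let both = R.pair (R.take 1) (R.take 2) (R.of_string "abc")

(* The second parser fails, but the byte taken by the first stays consumed. *)
let r = R.of_string "ab"
let failed =
  match R.pair (R.take 1) (R.char 'x') r with
  | _ -> false
  | exception Failure _ -> true
let remaining = R.take_all r

let () = Printf.printf "%S %S %b %S\n" (fst both) (snd both) failed remaining
```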
val return : 'a -> 'a parser
return x is a parser that consumes nothing and always returns x. return is just Fun.const.
val map : ('a -> 'b) -> 'a parser -> 'b parser
map f a is a parser that parses the stream with a to get v, and then returns f v.
val bind : 'a parser -> ('a -> 'b parser) -> 'b parser
bind a f is a parser that first uses a to parse a value v, then uses f v to select the next parser, and then uses that.
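A brief sketch of both combinators: map converts a run of digits to an integer, and bind reads a one-byte length followed by that many bytes. The names R, is_digit, number and prefixed are hypothetical, and the length-prefixed format is invented for the example.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

(* Hypothetical predicate used only for this example. *)
let is_digit = function '0' .. '9' -> true | _ -> false

(* map: post-process the parsed digits. *)
let number = R.map int_of_string (R.take_while1 is_digit)
let n = number (R.of_string "42;")

(* bind: the parsed length selects the next parser, take len. *)
let prefixed = R.bind R.uint8 R.take
let payload = prefixed (R.of_string "\003abcdef")

let () = Printf.printf "%d %S\n" n payload
```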
val format_errors : 'a parser -> ('a, [> `Msg of string ]) result parser
format_errors p catches Failure, End_of_file and Buffer_limit_exceeded exceptions and returns them as a formatted error message.
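To illustrate, the sketch below wraps a parser that runs out of input, so the failure comes back as a value rather than an exception. R is a hypothetical alias, and the message text is printed rather than assumed.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

(* take 5 cannot be satisfied by a 3-byte input. *)
let text =
  match R.format_errors (R.take 5) (R.of_string "abc") with
  | Ok s -> "ok: " ^ s
  | Error (`Msg m) -> "error: " ^ m

let () = print_endline text
```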
module Syntax : sig ... end
Convenient syntax for some of the combinators.
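The definition of parse uses the <* operator. The sketch below assumes that this operator is provided by Syntax and that p <* q runs p, then q, and keeps the result of p. R and whole_line are hypothetical names.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)
open R.Syntax            (* assumption: ( <* ) is defined here *)

(* Parse one line and insist that nothing follows it. *)
let whole_line = R.line <* R.end_of_input

let ok = whole_line (R.of_string "only\n")

(* A second line is left over, so end_of_input fails. *)
let rejected =
  match whole_line (R.of_string "one\ntwo\n") with
  | _ -> false
  | exception Failure _ -> true

let () = Printf.printf "%S %b\n" ok rejected
```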
val buffered_bytes : t -> int
buffered_bytes t is the number of bytes that can be read without reading from the underlying flow.
val peek : t -> Cstruct.t
peek t returns a view onto the active part of t's internal buffer.
Performing any operation that might add to the buffer may invalidate this, so it should be used immediately and then forgotten.
Cstruct.length (peek t) = buffered_bytes t.
val ensure : t -> int -> unit
ensure t n ensures that the buffer contains at least n bytes of data.
If not, it reads from the flow until there is.
After ensure t n returns, buffered_bytes t >= n.
val consume : t -> int -> unit
consume t n discards the first n bytes from t's buffer.
val consumed_bytes : t -> int
consumed_bytes t is the total number of bytes consumed, i.e. the offset into the stream of the next byte to be parsed.
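The low-level functions are typically used together: make sure enough data is buffered, inspect it through peek, then mark it as consumed. The sketch below follows that pattern on an in-memory reader. R is a hypothetical alias, and it assumes the cstruct library is linked for Cstruct.sub and Cstruct.to_string.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

let r = R.of_string "hello"

(* Make sure at least 2 bytes are buffered before looking at them. *)
let () = R.ensure r 2

(* Copy the bytes out immediately; the view from peek must not be kept. *)
let first_two = Cstruct.to_string (Cstruct.sub (R.peek r) 0 2)

(* Mark those 2 bytes as used. *)
let () = R.consume r 2
let consumed = R.consumed_bytes r
let left = R.buffered_bytes r

let () = Printf.printf "%S consumed=%d buffered=%d\n" first_two consumed left
```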
val eof_seen : t -> bool
eof_seen t indicates whether we've received End_of_file from the underlying flow.
If so, there will never be any further data beyond what peek already returns.
Note that this returns false if we're at the end of the stream but don't know it yet. Use at_end_of_input to be sure.
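The distinction between eof_seen and at_end_of_input can be sketched with a buffer-backed reader, for which eof_seen is documented above to be true from the start. R is a hypothetical alias. Building the buffer with Cstruct.of_string is an assumption of this example and requires the cstruct library.

```ocaml
module R = Eio.Buf_read  (* hypothetical alias *)

(* A reader over a 2-byte buffer with no flow behind it. *)
let r = R.of_buffer (Cstruct.of_string "ab").Cstruct.buffer

(* No more data can ever arrive, yet two bytes are still unread. *)
let seen = R.eof_seen r
let at_end_before = R.at_end_of_input r

(* After draining the buffer, the reader really is at the end. *)
let drained = R.take_all r
let at_end_after = R.at_end_of_input r

let () =
  Printf.printf "%b %b %S %b\n" seen at_end_before drained at_end_after
```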