Skip to content

Human-readable text syntax

The preserves.text module implements the Preserves human-readable text syntax.

The main entry points are functions stringify, parse, and parse_with_annotations.

>>> stringify(Record(Symbol('hi'), [1, [2, 3]]))
'<hi 1 [2 3]>'
>>> parse('<hi 1 [2 3]>')
#hi(1, (2, 3))

Formatter(format_embedded=lambda x: x, indent=None, with_commas=False, trailing_comma=False, include_annotations=True)

Bases: TextCodec

Printer (and indenting pretty-printer) for producing human-readable syntax from Preserves Values.

>>> f = Formatter()
>>> f.append({'a': 1, 'b': 2})
>>> f.append(Record(Symbol('label'), ['field1', ['field2item1', 'field2item2']]))
>>> print(f.contents())
{"a": 1 "b": 2} <label "field1" ["field2item1" "field2item2"]>

>>> f = Formatter(indent=4)
>>> f.append({'a': 1, 'b': 2})
>>> f.append(Record(Symbol('label'), ['field1', ['field2item1', 'field2item2']]))
>>> print(f.contents())
{
    "a": 1
    "b": 2
}
<label "field1" [
    "field2item1"
    "field2item2"
]>

Parameters:

Name Type Description Default
format_embedded

function accepting an Embedded.embeddedValue and returning a Value for serialization.

lambda x: x
indent int | None

None disables indented pretty-printing; otherwise, an int specifies indentation per nesting-level.

None
with_commas bool

True causes commas to separate sequence and set items and dictionary entries; False omits commas.

False
trailing_comma bool

True causes a comma to be printed after the final item or entry in a sequence, set or dictionary; False omits this trailing comma

False
include_annotations bool

True causes annotations to be included in the output; False causes them to be omitted.

True

Attributes:

Name Type Description
indent_delta int

indentation per nesting-level

chunks list[str]

fragments of output

Source code in preserves/text.py
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
def __init__(self,
             format_embedded=lambda x: x,
             indent=None,
             with_commas=False,
             trailing_comma=False,
             include_annotations=True):
    super(Formatter, self).__init__()
    self.indent_delta = 0 if indent is None else indent
    self.indent_distance = 0
    self.nesting = 0
    self.with_commas = with_commas
    self.trailing_comma = trailing_comma
    self.chunks = []
    self._format_embedded = format_embedded
    self.include_annotations = include_annotations

append(v)

Extend self.chunks with at least one chunk, together making up the text representation of v.

Source code in preserves/text.py
541
542
543
544
545
546
547
548
549
550
def append(self, v):
    """Extend `self.chunks` with at least one chunk, together making up the text
    representation of `v`."""
    if self.chunks and self.nesting == 0:
        self.write_indent_space()
    try:
        self.nesting += 1
        self._append(v)
    finally:
        self.nesting -= 1

contents()

Returns a str constructed from the join of the chunks in self.chunks.

Source code in preserves/text.py
492
493
494
def contents(self):
    """Returns a `str` constructed from the join of the chunks in `self.chunks`."""
    return u''.join(self.chunks)

is_indenting()

Returns True iff this Formatter is in pretty-printing indenting mode.

Source code in preserves/text.py
496
497
498
499
def is_indenting(self):
    """Returns `True` iff this [Formatter][preserves.text.Formatter] is in pretty-printing
    indenting mode."""
    return self.indent_delta > 0

Parser(input_buffer='', include_annotations=False, parse_embedded=lambda x: x)

Bases: TextCodec

Parser for the human-readable Preserves text syntax.

Parameters:

Name Type Description Default
input_buffer str

initial contents of the input buffer; may subsequently be extended by calling extend.

''
include_annotations bool

if True, wrap each value and subvalue in an Annotated object.

False
parse_embedded

function accepting a Value and returning a possibly-decoded form of that value suitable for placing into an Embedded object.

lambda x: x

Normal usage is to supply input text, and keep calling next until a ShortPacket exception is raised:

>>> d = Parser('123 "hello" @x []')
>>> d.next()
123
>>> d.next()
'hello'
>>> d.next()
()
>>> d.next()
Traceback (most recent call last):
  ...
preserves.error.ShortPacket: Short input buffer

Alternatively, keep calling try_next until it yields None, which is not in the domain of Preserves Values:

>>> d = Parser('123 "hello" @x []')
>>> d.try_next()
123
>>> d.try_next()
'hello'
>>> d.try_next()
()
>>> d.try_next()

For convenience, Parser implements the iterator interface, backing it with try_next, so you can simply iterate over all complete values in an input:

>>> d = Parser('123 "hello" @x []')
>>> list(d)
[123, 'hello', ()]
>>> for v in Parser('123 "hello" @x []'):
...     print(repr(v))
123
'hello'
()

Supply include_annotations=True to read annotations alongside the annotated values:

>>> d = Parser('123 "hello" @x []', include_annotations=True)
>>> list(d)
[123, 'hello', @#x ()]

If you are incrementally reading from, say, a socket, you can use extend to add new input as if comes available:

>>> d = Parser('123 "he')
>>> d.try_next()
123
>>> d.try_next() # returns None because the input is incomplete
>>> d.extend('llo"')
>>> d.try_next()
'hello'
>>> d.try_next()

Attributes:

Name Type Description
input_buffer str

buffered input waiting to be processed

index int

read position within input_buffer

Source code in preserves/text.py
132
133
134
135
136
137
def __init__(self, input_buffer=u'', include_annotations=False, parse_embedded=lambda x: x):
    super(Parser, self).__init__()
    self.input_buffer = input_buffer
    self.index = 0
    self.include_annotations = include_annotations
    self.parse_embedded = parse_embedded

extend(text)

Appends text to the remaining contents of self.input_buffer, trimming already-processed text from the front of self.input_buffer and resetting self.index to zero.

Source code in preserves/text.py
139
140
141
142
143
def extend(self, text):
    """Appends `text` to the remaining contents of `self.input_buffer`, trimming already-processed
    text from the front of `self.input_buffer` and resetting `self.index` to zero."""
    self.input_buffer = self.input_buffer[self.index:] + text
    self.index = 0

next()

Reads the next complete Value from the internal buffer, raising ShortPacket if too few bytes are available, or DecodeError if the input is invalid somehow.

Source code in preserves/text.py
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
def next(self):
    """Reads the next complete `Value` from the internal buffer, raising
    [ShortPacket][preserves.error.ShortPacket] if too few bytes are available, or
    [DecodeError][preserves.error.DecodeError] if the input is invalid somehow.

    """
    self.skip_whitespace()
    c = self.peek()
    if c == '"':
        self.skip()
        return self.wrap(self.read_string('"'))
    if c == '|':
        self.skip()
        return self.wrap(Symbol(self.read_string('|')))
    if c == '@':
        self.skip()
        return self.unshift_annotation(self.next(), self.next())
    if c == ';':
        raise DecodeError('Semicolon is reserved syntax')
    if c == ':':
        raise DecodeError('Unexpected key/value separator between items')
    if c == '#':
        self.skip()
        c = self.nextchar()
        if c in ' \t': return self.unshift_annotation(self.comment_line(), self.next())
        if c in '\n\r': return self.unshift_annotation('', self.next())
        if c == '!':
            return self.unshift_annotation(
                Record(Symbol('interpreter'), [self.comment_line()]),
                self.next())
        if c == 'f': self.require_delimiter('#f'); return self.wrap(False)
        if c == 't': self.require_delimiter('#t'); return self.wrap(True)
        if c == '{': return self.wrap(self.read_set())
        if c == '"': return self.wrap(self.read_literal_binary())
        if c == 'x':
            c = self.nextchar()
            if c == '"': return self.wrap(self.read_hex_binary())
            if c == 'd': return self.wrap(self.read_hex_float())
            raise DecodeError('Invalid #x syntax')
        if c == '[': return self.wrap(self.read_base64_binary())
        if c == ':':
            if self.parse_embedded is None:
                raise DecodeError('No parse_embedded function supplied')
            return self.wrap(Embedded(self.parse_embedded(self.next())))
        raise DecodeError('Invalid # syntax')
    if c == '<':
        self.skip()
        vs = self.upto('>', False)
        if len(vs) == 0:
            raise DecodeError('Missing record label')
        return self.wrap(Record(vs[0], vs[1:]))
    if c == '[':
        self.skip()
        return self.wrap(self.upto(']', True))
    if c == '{':
        self.skip()
        return self.wrap(self.read_dictionary())
    if c in '>]},':
        raise DecodeError('Unexpected ' + c)
    self.skip()
    return self.wrap(self.read_raw_symbol_or_number([c]))

try_next()

Like next, but returns None instead of raising ShortPacket.

Source code in preserves/text.py
385
386
387
388
389
390
391
392
393
def try_next(self):
    """Like [next][preserves.text.Parser.next], but returns `None` instead of raising
    [ShortPacket][preserves.error.ShortPacket]."""
    start = self.index
    try:
        return self.next()
    except ShortPacket:
        self.index = start
        return None

parse(text, **kwargs)

Yields the first complete encoded value from text, passing kwargs through to the Parser constructor. Raises exceptions as per next.

Parameters:

Name Type Description Default
text str

encoded data to decode

required
Source code in preserves/text.py
404
405
406
407
408
409
410
411
412
413
def parse(text, **kwargs):
    """Yields the first complete encoded value from `text`, passing `kwargs` through to the
    [Parser][preserves.text.Parser] constructor. Raises exceptions as per
    [next][preserves.text.Parser.next].

    Args:
        text (str): encoded data to decode

    """
    return Parser(input_buffer=text, **kwargs).next()

parse_with_annotations(bs, **kwargs)

Like parse, but supplying include_annotations=True to the Parser constructor.

Source code in preserves/text.py
415
416
417
418
def parse_with_annotations(bs, **kwargs):
    """Like [parse][preserves.text.parse], but supplying `include_annotations=True` to the
    [Parser][preserves.text.Parser] constructor."""
    return Parser(input_buffer=bs, include_annotations=True, **kwargs).next()

stringify(v, **kwargs)

Convert a single Value v to a string. Any supplied kwargs are passed on to the underlying Formatter constructor.

Source code in preserves/text.py
602
603
604
605
606
607
def stringify(v, **kwargs):
    """Convert a single `Value` `v` to a string. Any supplied `kwargs` are passed on to the
    underlying [Formatter][preserves.text.Formatter] constructor."""
    e = Formatter(**kwargs)
    e.append(v)
    return e.contents()