spy documentation

spy is a CLI and API for processing streams in Python.

Contents

Introduction

spy is a Python CLI. It’s quite powerful, as you’ll see below, but let’s start with the basics: you feed it a Python expression, it spits out the result.

$ spy '3*4'
12

There’s no need to import modules—just use them and spy will make sure they’re available:

$ spy 'math.pi'
3.141592653589793

I/O

Standard input is exposed as a file-like object called pipe:

$ cat test.txt
this
file
has
five
lines
$ spy 'pipe.readline()' < test.txt
this

It’s a io.TextIOBase, with a couple of extra features: You can index into it, or convert all of stdin into a string with str().

$ spy 'pipe[1]' < test.txt
file
$ spy 'pipe[1::2]' < test.txt
['file', 'five']
$ spy 'str(pipe).replace("\n", " ")' < test.txt
this file has five lines

Passing -l (or --each-line) to spy will iterate through stdin instead, so your expressions will run once per line of input:

$ spy -l '"-%s-" % pipe' < test.txt
-this-
-file-
-has-
-five-
-lines-

spy helpfully removes the terminating newlines from these strings. If you don’t want that, you can pass --raw to get stdin unadulterated.

$ spy -lrc repr < test.txt
'this\n'
'file\n'
'has\n'
'five\n'
'lines\n'

Piping

Much like the standard assortment of unix utilities, which expect to have their inputs and outputs wired up to each other in order to do useful things, each fragment processes some data then passes it on to the next one.

Data passes from left to right. Fragments can return the special constant spy.DROP to prevent further processing of the current datum and continue to the next.

$ spy '3' 'pipe * 2' 'pipe * "!"'
!!!!!!
$ spy -l 'if pipe.startswith("f"): pipe = spy.DROP' < test.txt
this
has
lines

Limiting output

--start=<integer>, -s <integer>

Start printing output at this zero-based index.

--end=<integer>, -e <integer>

Stop processing at this zero-based index.

-s and -e mirror Python’s slice semantics, so -s 1 -e 3 will show results 1 and 2. This means -e on its own is equivalent to a limit on the number of results.

Once the result specified by -e has been hit, no more data will be processed.

Data flow

Before we construct anything more complex, a brief discourse into how data moves around in spy: Each fragment in spy tries to consume data from the fragment to its left. It processes it, then yields to the fragment to its right, which will do the same thing. To run the program, spy just tries to pump as much data out of the rightmost fragment as it can—everything else is handled by the fragment mechanic.

In the examples I’ve given above, each fragment has consumed and yielded data on a one-to-one basis, but there’s no inherent reason for that restriction. Fragments can yield or consume (or both) multiple values using spy.many and spy.collect, respectively.

Decorators

In one example above, we used an if statement to filter by a predicate. That’s far from elegant—by my rough guess, about half the characters in the fragment are boilerplate. spy provides some function decorators to avoid repeating this and a few other common constructs—they’re available as flags from the CLI:

--accumulate <fragment>, -a <fragment>

passes the the result of spy.collect() to the fragment.

--callable <fragment>, -c <fragment>

calls whatever the following fragment returns, with a single argument: the input value to the fragment.

--filter <fragment>, -f <fragment>

filters the data stream, using the fragment as a predicate: if it returns any true value, the data passes through, but if it returns a false value spy.DROP is returned instead.

--keywords <fragment>, -k <fragment>

executes the fragment using its own input value as the local scope, which must be a mapping. Names from the global scope (but not pipe) are still available unless shadowed by keys in the input mapping.

--many <fragment>, -m <fragment>

calls spy.many() with the return value of the fragment (which must be iterable).

Literal decorators

Literal decorators are a kind of decorator that accept string arguments rather than Python code.

--interpolate <string>, -i <string>

uses <string> as a str.format() format string on the input. Positional parameters like {0} index into the input value, and named ones access the local scope of the fragment, so the full input value is available as {pipe}.

$ spy -li '-{pipe}-' < test.txt
-this-
-file-
-has-
-five-
-lines-
--regex <string>, --regexp <string>, -R <string>

matches the input against <string> as a regexp using re.match().

$ spy -lR 'f.*' -fc id -i '{0}' < test.txt
file
five

Deferred application

spy overloads callable objects (when they’re builtins or autoimported) to add implementations of most Python operators. These return a function that calls the original function and then applies the specified operation. They take a single argument only, and are essentially just a shortcut that lets you avoid typing (pipe) in some cases:

$ spy '[1,2,3]' -c 'sum/2'
3.0
$ spy '[1,2,3]' -c 'sum/len'
2.0

Doing stuff

Nothing here is particularly useful in isolation. Let’s throw it all together by pretending we’re jq:

$ spy -lc json.loads -fk '"Rutile" in export_commodities' -k name -e 10 < stations.jsonl
Hieb Orbital
Hahn Terminal
Anderson Colony
So-yeon Mines
Williamson Enterprise
Julian Hub
Fancher Enterprise
Neville Vision
Raleigh Terminal
Arrhenius Beacon

Note how -l trivially gives us newline-delimited JSON, a job which was previously so hard it required its own top-2000 PyPI package!

Exception handling

If your code raises an uncaught exception, spy will try to intercept and reformat the traceback, omitting the frames from spy’s own machinery. Special frames will be inserted where appropriate describing the fragment’s position, source code, and input data at the time the exception was raised:

$ spy 'None + 2'
Traceback (most recent call last):
  Fragment 1
    None + 2
    input to fragment was <SpyFile stream='<stdin>'>
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

If an exception is raised in a decorator outside the call to the fragment body, the fragment is mentioned anyway. This is not strictly true, given that none of the code in the fragment takes part in the call stack in this case, but this particular lie is almost universally more useful:

$ spy -c None
Traceback (most recent call last):
  Fragment 1, in decorator spy.decorators.callable
    --callable 'None'
    input to fragment was <SpyFile stream='<stdin>'>
TypeError: 'NoneType' object is not callable

The philosophy here is that what made it go wrong is more interesting than exactly how it went wrong, so that’s what spy gives you by default. You can get the real traceback by passing --no-exception-handling to spy.

CLI reference

Regular options

--break

Start a post-mortem debugging session with pdb if an exception occurs during execution.

--each-line, -l

Process each line as its own string (rather than stdin as a file at once).

Equivalent to starting with a fragment of spy.many(pipe), but more efficient since we don’t need save the contents of the input stream for indexing.

--no-default-fragments

Don’t add any fragments to the chain that weren’t explicitly specified in the command line.

--no-exception-handling

Disable spy’s exception handling and reformatting. This is mostly only useful for debugging changes to spy itself.

--pipe-name=<name>

Name the magic pipe variable <name> instead of pipe.

--prelude=<statement>, -p <statement>

Run some Python before processing starts.

--raw, -r

Don’t wrap stdin before passing it to the first fragment.

Output limiting options

The index arguments for these options refer to results, not input. If a single piece of input data results in 4 separate pieces of output, they’ll all count.

--start=<index>, -s <index>

Start printing results at this zero-based index.

--end=<index>, -e <index>

Stop processing data at this zero-based index.

Decorators

Decorator options must precede a code step. Multiple decorators can stack together. They have exactly the same effect as decorating a function in Python.

See the decorator API docs for a list of them.

Alternative actions

--help, -h

Show usage and option descriptions.

--show-fragments

Print out a list of string representations of the complete fragment chain that would be executed.

spy from Python

The introduction showed how to use spy from the command line. That’s not the only way: spy works just as well from other Python code. The CLI is just a wrapper around spy’s public API to make it easier to get to.

I don’t think it is useful in very many cases as a Python library, but if you want to create an alternative command-line interface for example, this may be of interest.

API documentation is available. What follows is a (very) brief guide, which I hope to expand on in the future.

Basic usage

As with the CLI, you create fragments and then pass data through them. And, as with the CLI, creating fragments is easy. You decorate a regular function with spy.fragment():

import spy

@spy.fragment
def add_five(v):
    return v + 5

So, on to the feeding data part. You don’t feed data to fragments on their own, but to chains, so let’s create one:

chain = spy.chain([add_five])

In order to feed data into it, call the chain object with an iterable to feed into the chain. The call will return an iterable of the results:

data = [1, 2, 3, 4]
print(list(chain(data)))  # [6, 7, 8, 9]

These iterators don’t interfere with each other, even if they’re created by the same chain object, so one chain can be used to process multiple independent sets of input data.

Differences from the CLI

As documented, collect() takes a context argument. It can be omitted when using the CLI because it’s automatically filled in (it has to be, since there’s no way to access the context object from CLI fragments). There is no equivalent mechanism outside the CLI, so if you want to use collect(), you must provide context. You can get the context object by accepting a context argument in your fragment function:

@spy.fragment
def foo(v, context):
   c = spy.collect(context)
   # do stuff with c

Extending spy

spy supports a very simple extension mechanism using entry points. A package wishing to extend its functionality should provide an entry point for the group spy.init. The entry point will be called with no arguments before spy does anything.

There are two primary means of extending spy: adding fragment decorators, and adding functions to prelude.

Extending the prelude

Adding functions to the prelude is trivial: simply import spy.prelude and put things in it.

from spy import prelude
prelude.uc = lambda s: str(s).upper()
$ spy -c uc <<< hello
HELLO

Adding decorators

spy’s decorators are created using a helper decorator, spy.decorators.decorator(), which does a lot of work to set up exception handling and deal with the optional context argument to fragments. Because of this, the form decorators must take is slightly prescribed. Basic usage is as follows:

@decorator('--uppercase', '-U', doc='Make the result uppercase')
def uppercase(fn, v, context):
    return str(fn(v, context)).upper()
$ spy -u pipe <<< hello
HELLO

In general, the function to which decorator() is applied to three arguments for the decorated fragment function, the input value and the context. fn is adjusted to always take the context argument for simplicity. The decorator function is responsible for calling fn and returning the result.

If it’s advantageous to do some setup first, it can be pulled into a function and passed as the prep keyword argument. Its return value will be passed as a fourth argument to the decorator.

def _prep_cached(fn):
    return {}

@decorator('--cached', '-C', doc='Cache this fragment', prep=_prep_cached)
def cached(fn, v, context, cache):
    if v not in cache:
        cache[v] = fn(v, context)
    return cache[v]
$ spy -m '[1,2,2,2,3,4]' -Cc print
1
2
3
4

Finally, if your decorator should take a literal string rather than a fragment, use the takes_string parameter. The decorator API is as above, except that the fragment function will return a tuple of its execution scope and the string.

@decorator('--template', '-t', doc='Template this string', takes_string=True)
def template(fn, v, context):
    env, s = fn(v, context)
    return string.Template(s).substitute(env)
$ spy '{"a": 10, "b": 20}' -kt '$a $b'
10 20

API

Detailed documentation for spy’s API.

spy

This module exposes spy’s core API.

See also

spy.decorators
Function decorators for use with spy fragments
Constants
spy.DROP

A signaling object: when returned from a fragment, the fragment will not yield any value.

Exceptions
exception spy.CaughtException[source]
print_traceback()[source]

Print the (formatted) traceback captured when the exception was raised.

Functions
class spy.catch

A context manager. Exceptions raised in the context will be subject to spy’s traceback formatting and wrapped in a CaughtException. If these are not caught, spy uses an exception hook to force them to be formatted properly. If you opt to catch CaughtException instead, you can use its print_traceback() method to print the formatted traceback without exiting.

class spy.chain(seq)[source]

Construct a chain of fragments from seq.

Parameters:seq (sequence) – Fragments to chain together
__call__(data)

Alias for apply().

apply(data)[source]

Feed data into the fragment chain, and return an iterator over the resulting data.

classmethod auto_fragments(seq)[source]

Like the regular constructor, but for each element in seq, apply fragment() to it if it isn’t already a fragment.

Items in seq must be either regular functions (not generators) or fragments.

run_to_exhaustion(data)[source]

Call apply(), then iterate until the chain runs out of data.

spy.collect(context)[source]

Return an iterator of the elements being processed by the current fragment. Can be used to write a fragment that consumes multiple items.

@spy.fragment[source]

Given a callable func, return a fragment that calls func to process data. func must take at least one positional argument, a single value to process and return.

Optionally it can take another argument, called context. If it does, a context object will be passed to it on each invocation. This object has no documented public functionality; its purpose is to be passed to spy API functions that require it (namely collect()).

spy.many(ita)[source]

Return a signaling object that instructs spy to yield values from ita from the current fragment, instead of yielding only one value.

spy.decorators

This module contains various function decorators for use in spy fragments.

@spy.decorators.accumulate[source]
--accumulate, -a

Accumulate values into an iterator by calling spy.collect(), and pass that to the fragment.

This can be used to write a fragment which executes at most once while passing data through:

-ma 'x = y;'

@spy.decorators.callable[source]
--callable, -c

Call the result of the decorated fragment

@spy.decorators.filter[source]
--filter, -f

Use the decorated fragment as a predicate—only elements for which the fragment returns a true value will be passed through.

@spy.decorators.keywords[source]
--keywords, -k

On fragments generated by the CLI, sets the local scope to the input value before each invocation. Normal Python functions cannot do this—trying to decorate them will raise ValueError.

@spy.decorators.many[source]
--many, -m

Call spy.many() with the result of the fragment.

@spy.decorators.focus(ITEM)[source]
--focus=ITEM, -o ITEM

Operate on pipe[ITEM]. The result will be assigned to the same position in a shallow copy of the input, which will need to be mutable despite not normally being modified.

On the CLI, if ITEM is a decimal integer, it will be interpreted as an integer index. If instead if starts with a dot ., everything after the dot will be taken as a literal string key.

If you have lenses installed, some more forms are available. Two or three decimal integers separated by : will focus on each element of a slice of the input:

$ spy '[1,2,3,4,5,6]' -o 1::2 'pipe * 7'
[1, 14, 3, 28, 5, 42]

And a string starting with _ will be evaluated as a Python expression with _ bound to lens, allowing you to focus with arbitrary lenses:

$ spy '["abc", "def"]' -o '_.Each()[1]' -c str.upper
['aBc', 'dEf']

Natively-understood focuses will be turned into lenses too, allowing them to operate on any immutable object that lenses can handle.

@spy.decorators.magnify(ITEM)[source]
--magnify=ITEM, -O ITEM

As focus(), except that the result is returned as-is, rather than spliced into a copy of the input. The portion of the input that was not magnified is thus discarded.

Magnifying with a lens that has multiple foci will simply use the first one. Further work on this area is aspired to.

@spy.decorators.try_except[source]
--try, -t

Filter out input that causes the fragment to raise an exception. This is the equivalent of a try: except:-block in the fragment.

Literal decorators

On the CLI, these decorators take a literal string rather than Python code. In Python-land, they expect to decorate a function that returns (scope, string). They’re especially pointless for non-CLI uses, and this documentation is written with CLI usage in mind.

@spy.decorators.interpolate
--interpolate, -i

Interpolate the literal argument as a str.format() format string.

Keyword substitutions ({foo}) look up variable names. Positional substitutions ({2}) are indexes into the value being processed.

@spy.decorators.regex[source]
--regex, --regexp, -R

Match a regexp against the input using re.match().

Defining decorators

For integration with spy’s CLI and exception handling, decorators should be created using decorator().

@spy.decorators.decorator(name, *aliases[, doc=None][, prep=None][, takes_string=False])[source]

Turns a wrapper function into a spy decorator.

name and aliases are the CLI options that should refer to this decorator; doc is the help output to be printed next to it by --help.

If prep is passed, it must be a callable taking one argument, the callable we are about to decorate, and the wrapper will be called as:

wrapper(fn, v, context, opaque)

where opaque is whatever prep returns. Otherwise, the wrapper will be called with the first three arguments only.

If takes_string is True, the command-line option will consume a literal string instead of Python code, and fn will return a tuple of its local scope and the literal string value.

For usage examples, see Adding decorators.

spy.prelude

Utilities that are automatically imported during spy runs and of little use elsewhere.

spy.prelude.id(x)[source]

Return x.

spy.prelude.ft(*args)[source]

Return a tuple that, when called, calls all its members with the same argument.

Examples

Sort

$ spy -mc sorted < test.txt
file
five
has
lines
this

Similarly:

$ spy -mc reversed < test.txt
lines
five
has
file
this

Filter

$ spy -l -fc 'len == 4' < test.txt
this
file
five

Enumerate

Naively:

$ spy -m "['{}: {}'.format(n, v) for n, v in enumerate(pipe, 1)]" < test.txt
1: this
2: file
3: has
4: five
5: lines

Taking advantage of spy piping:

$ spy -m 'enumerate(pipe, 1)' -i '{}: {}' < test.txt
1: this
2: file
3: has
4: five
5: lines

Convert CSV to JSON

$ spy -c csv.DictReader -c list -c json.dumps < thing.csv > thing.json

Glossary

fragment

An object which can be used by spy.chain() to create chained iterators.

The following kinds of object only are considered fragments:

  • The return value of a successful call to spy.fragment()
  • A generator taking exactly one argument, the iterable to get input values from.

Note

In any given version of spy, it’s possible that other objects may work as fragments. This is not part of the API, and any accidental support for using other objects may go away at any time.