spy documentation¶
spy is a CLI and API for processing streams in Python.
Contents¶
Introduction¶
spy is a Python CLI. It’s quite powerful, as you’ll see below, but let’s start with the basics: you feed it a Python expression, it spits out the result.
$ spy '3*4'
12
There’s no need to import modules—just use them and spy will make sure they’re available:
$ spy 'math.pi'
3.141592653589793
I/O¶
Standard input is exposed as a file-like object called pipe:
$ cat test.txt
this
file
has
five
lines
$ spy 'pipe.readline()' < test.txt
this
It’s an io.TextIOBase, with a couple of extra features: you can index into it, or convert all of stdin into a string with str().
$ spy 'pipe[1]' < test.txt
file
$ spy 'pipe[1::2]' < test.txt
['file', 'five']
$ spy 'str(pipe).replace("\n", " ")' < test.txt
this file has five lines
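The indexing and str() behaviour can be imitated in plain Python. The following is only a sketch of the idea, not spy's implementation: the class name IndexableStream is hypothetical, and it assumes the whole stream can be read into memory and that line endings are dropped.

```python
import io


class IndexableStream:
    """Hypothetical wrapper: a text stream that also supports indexing and str()."""

    def __init__(self, stream):
        self._stream = stream
        self._lines = None  # filled lazily on the first index or str() call

    def _load(self):
        # Read everything once, dropping each line's terminating newline.
        if self._lines is None:
            self._lines = self._stream.read().splitlines()
        return self._lines

    def __getitem__(self, index):
        # Works for both integers and slices, like a list.
        return self._load()[index]

    def __str__(self):
        return "\n".join(self._load())


pipe = IndexableStream(io.StringIO("this\nfile\nhas\nfive\nlines\n"))
second = pipe[1]
every_other = pipe[1::2]
joined = str(pipe).replace("\n", " ")
```

Here second, every_other and joined correspond to the three spy invocations above.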
Passing -l (or --each-line) to spy will iterate through stdin instead, so your expressions will run once per line of input:
$ spy -l '"-%s-" % pipe' < test.txt
-this-
-file-
-has-
-five-
-lines-
spy helpfully removes the terminating newlines from these strings.
Piping¶
spy accepts several expressions, called fragments, on one command line. Much like the standard assortment of Unix utilities, which expect to have their inputs and outputs wired up to each other in order to do useful things, each fragment processes some data and then passes it on to the next one.
Data passes from left to right. Fragments can return the special constant spy.DROP to prevent further processing of the current datum and continue to the next.
$ spy '3' 'pipe * 2' 'pipe * "!"'
!!!!!!
$ spy -l 'if pipe.startswith("f"): pipe = spy.DROP' < test.txt
this
has
lines
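The left-to-right flow and the effect of a drop sentinel can be modelled in a few lines of plain Python. This is a sketch under stated assumptions, not how spy is implemented: DROP here is a local stand-in for spy.DROP, and run_chain is a hypothetical helper.

```python
DROP = object()  # hypothetical stand-in for spy.DROP


def run_chain(steps, values):
    """Pass each value through every step, left to right."""
    results = []
    for value in values:
        for step in steps:
            value = step(value)
            if value is DROP:
                break  # skip the remaining steps for this value
        else:
            # Only reached when no step dropped the value.
            results.append(value)
    return results


repeated = run_chain([lambda v: v * 2, lambda v: v * "!"], [3])
kept = run_chain(
    [lambda line: DROP if line.startswith("f") else line],
    ["this", "file", "has", "five", "lines"],
)
```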
Limiting output¶
--start=<integer>, -s <integer>¶
    Start printing output at this zero-based index.
--end=<integer>, -e <integer>¶
    Stop processing at this zero-based index.
-s and -e mirror Python’s slice semantics, so -s 1 -e 3 will show results 1 and 2. This means -e on its own is equivalent to a limit on the number of results.
Once the result specified by -e has been hit, no more data will be processed.
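The same slice semantics, including the early stop, can be seen with itertools.islice from the standard library. This is an illustration of the behaviour described above, not a claim about how spy implements it; the results generator and the consumed list are made up for the example.

```python
from itertools import islice

consumed = []


def results():
    # Record each item as it is produced so we can see how far processing got.
    for n in range(10):
        consumed.append(n)
        yield n


# Comparable to -s 1 -e 3: skip result 0, keep results 1 and 2, then stop.
window = list(islice(results(), 1, 3))
```

After this runs, consumed holds only the first three items: nothing past the end index was ever produced.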
Data flow¶
First, a brief digression on how data moves around in spy. Each fragment tries to consume data from the fragment to its left. It processes it, then yields to the fragment to its right, which will do the same thing. To run the program, spy just tries to pump as much data out of the rightmost fragment as it can; everything else is handled by the fragment mechanism.
In the examples I’ve given above, each fragment has consumed and yielded data on a one-to-one basis, but there’s no inherent reason for that restriction. Fragments can yield or consume (or both) multiple values using spy.many and spy.collect, respectively.
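Plain Python generators give a rough model of this pull-based flow. The sketch below is an analogy only: source, explode and pair_up are hypothetical stages, with explode yielding several values per input (the role spy.many plays) and pair_up consuming several inputs per output (the role spy.collect plays).

```python
def source(values):
    yield from values


def explode(upstream):
    # One input, several outputs: each word becomes its individual letters.
    for word in upstream:
        yield from word


def pair_up(upstream):
    # Several inputs, one output: take two values at a time from upstream.
    for first in upstream:
        second = next(upstream, None)  # None when the input runs out
        yield (first, second)


pipeline = pair_up(explode(source(["ab", "cde"])))
pairs = list(pipeline)  # pumping the rightmost stage drives everything else
```

Nothing runs until the last line asks the rightmost generator for data; each stage then pulls what it needs from the stage to its left.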
Decorators¶
In one example above, we used an if statement to filter by a predicate. That’s far from elegant: by my rough guess, about half the characters in the fragment are boilerplate. spy provides some function decorators to avoid repeating this and a few other common constructs. They’re available as flags from the CLI:
--accumulate <fragment>, -a <fragment>¶
    Passes the result of spy.collect() to the fragment.
--callable <fragment>, -c <fragment>¶
    Calls whatever the following fragment returns, with a single argument: the input value to the fragment.
--filter <fragment>, -f <fragment>¶
    Filters the data stream, using the fragment as a predicate: if it returns any true value, the data passes through, but if it returns a false value, spy.DROP is returned instead.
--many <fragment>, -m <fragment>¶
    Calls spy.many() with the return value of the fragment (which must be iterable).
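To make the descriptions of --filter and --callable concrete, here is how equivalent decorators might look in plain Python. This is a sketch based only on the prose above, not spy's source: filter_, callable_ and the local DROP sentinel are hypothetical names.

```python
import functools

DROP = object()  # hypothetical stand-in for spy.DROP


def filter_(fn):
    """Pass the input through when fn(v) is true, otherwise return DROP."""
    @functools.wraps(fn)
    def wrapper(v):
        return v if fn(v) else DROP
    return wrapper


def callable_(fn):
    """Call whatever fn(v) returns, passing v as the single argument."""
    @functools.wraps(fn)
    def wrapper(v):
        return fn(v)(v)
    return wrapper


@filter_
def four_letters(v):
    return len(v) == 4


@callable_
def upper(v):
    return str.upper  # the wrapper then calls str.upper(v)


kept = [w for w in ["this", "has", "five"] if four_letters(w) is not DROP]
shouted = upper("spy")
```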
Exception handling¶
If your code raises an uncaught exception, spy will try to intercept and reformat the traceback, omitting the frames from spy’s own machinery. Special frames will be inserted where appropriate describing the fragment’s position, source code, and input data at the time the exception was raised:
$ spy 'None + 2'
Traceback (most recent call last):
Fragment 1
None + 2
input to fragment was <SpyFile stream=<_io.TextIOWrapper name='<stdin>' mode='r' encoding='UTF-8'>>
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
If an exception is raised in a decorator outside the call to the fragment body, the fragment is mentioned anyway. This is not strictly true, given that none of the code in the fragment takes part in the call stack in this case, but this particular lie is almost universally more useful:
$ spy -c None
Traceback (most recent call last):
Fragment 1, in decorator spy.decorators.callable
--callable 'None'
input to fragment was <SpyFile stream=<_io.TextIOWrapper name='<stdin>' mode='r' encoding='UTF-8'>>
File "/home/edk/src/spy/spy/decorators.py", line 44, in callable
return result(v)
TypeError: 'NoneType' object is not callable
The philosophy here is that what made it go wrong is more interesting than exactly how it went wrong, so that’s what spy gives you by default. You can get the real traceback by passing --no-exception-handling to spy.
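The idea of reporting which fragment failed and what its input was can be sketched in plain Python as follows. This does not reproduce spy's traceback rewriting; FragmentError and run_steps are hypothetical names, and the message format is invented for the example.

```python
class FragmentError(Exception):
    """Hypothetical wrapper recording which step failed and on what input."""

    def __init__(self, index, value, cause):
        super().__init__(
            "Fragment {}: input was {!r}: {}: {}".format(
                index, value, type(cause).__name__, cause
            )
        )
        self.index = index
        self.value = value


def run_steps(steps, value):
    for index, step in enumerate(steps, 1):
        try:
            value = step(value)
        except Exception as exc:
            # value still holds the input to the failing step here.
            # The original exception stays reachable via __cause__.
            raise FragmentError(index, value, exc) from exc
    return value


try:
    run_steps([lambda v: v + 1, lambda v: None + v], 1)
except FragmentError as err:
    report = str(err)
    failed_at = err.index
```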
CLI reference¶
Regular options¶
--each-line, -l¶
    Process each line as its own string (rather than stdin as a file at once).
--no-default-fragments¶
    Don’t add any fragments to the chain that weren’t explicitly specified in the command line.
--no-exception-handling¶
    Disable spy’s exception handling and reformatting. This is mostly useful only for debugging changes to spy itself.
--pipe-name=<name>¶
    Name the magic pipe variable <name> instead of pipe.
Output limiting options¶
The index arguments for these options refer to results, not input. If a single piece of input data results in 4 separate pieces of output, they’ll all count.
--start=<index>, -s <index>¶
    Start printing results at this zero-based index.
--end=<index>, -e <index>¶
    Stop processing data at this zero-based index.
Decorators¶
Decorator options must precede a code step. Multiple decorators can stack together. They have exactly the same effect as decorating a function in Python.
See the decorator API docs for a list of them.
Alternative actions¶
--help, -h¶
    Show usage and option descriptions.
--show-fragments¶
    Print out a list of string representations of the complete fragment chain that would be executed.
spy from Python¶
The introduction showed how to use spy from the command line. That’s not the only way: spy works just as well from other Python code. The CLI is just a wrapper around spy’s public API to make it easier to get to.
I don’t think it is useful as a Python library in very many cases, but if you want to create an alternative command-line interface, for example, this may be of interest.
API documentation is available. What follows is a (very) brief guide, which I hope to expand on in the future.
Basic usage¶
As with the CLI, you create fragments and then pass data through them. And, as with the CLI, creating fragments is easy. You decorate a regular function with spy.fragment():
import spy

@spy.fragment
def add_five(v):
    return v + 5
So, on to the feeding data part. You don’t feed data to fragments on their own, but to chains, so let’s create one:
chain = spy.chain([add_five])
To feed data into it, call the chain object with an iterable. The call will return an iterable of the results:
data = [1, 2, 3, 4]
print(list(chain(data))) # [6, 7, 8, 9]
These iterators don’t interfere with each other, even if they’re created by the same chain object, so one chain can be used to process multiple independent sets of input data.
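That independence is the same property plain generator functions have: each call builds a fresh iterator with its own state. The following stand-alone sketch illustrates it without using spy; make_chain is a hypothetical analogue of a chain, not spy's API.

```python
def make_chain(steps):
    """Hypothetical analogue of a chain: returns a callable that builds a
    fresh generator for every input iterable it is given."""
    def run(data):
        for value in data:
            for step in steps:
                value = step(value)
            yield value
    return run


chain = make_chain([lambda v: v + 5])
first = chain([1, 2, 3, 4])
second = chain([10, 20])

# Alternate between the two iterators; neither disturbs the other.
interleaved = [next(first), next(second), next(first), next(second)]
rest = list(first)
```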
Differences from the CLI¶
As documented, collect() takes a context argument. It can be omitted when using the CLI because it’s automatically filled in (it has to be, since there’s no way to access the context object from CLI fragments). There is no equivalent mechanism outside the CLI, so if you want to use collect(), you must provide context. You can get the context object by accepting a context argument in your fragment function:
@spy.fragment
def foo(v, context):
    c = spy.collect(context)
    # do stuff with c
API¶
Detailed documentation for spy’s API.
spy¶
This module exposes spy’s core API.
See also
spy.decorators - Function decorators for use with spy fragments
Constants¶
Exceptions¶
Functions¶
class spy.catch¶
    A context manager. Exceptions raised in the context will be subject to spy’s traceback formatting and wrapped in a CaughtException. If these are not caught, spy uses an exception hook to force them to be formatted properly. If you opt to catch CaughtException instead, you can use its print_traceback() method to print the formatted traceback without exiting.
class spy.chain(seq)¶
    Construct a chain of fragments from seq.

    Parameters: seq (sequence) – Fragments to chain together

    apply(data)¶
        Feed data into the fragment chain, and return an iterator over the resulting data.

    classmethod auto_fragments(seq)¶
        Like the regular constructor, but for each element in seq, apply fragment() to it if it isn’t already a fragment.

        Items in seq must be either regular functions (not generators) or fragments.
spy.collect(context)¶
    Return an iterator of the elements being processed by the current fragment. Can be used to write a fragment that consumes multiple items.
@spy.fragment¶
    Given a callable func, return a fragment that calls func to process data. func must take at least one positional argument, a single value to process and return.

    Optionally it can take another argument, called context. If it does, a context object will be passed to it on each invocation. This object has no documented public functionality; its purpose is to be passed to spy API functions that require it (namely collect()).
spy.many(ita)¶
    Return a signaling object that instructs spy to yield values from ita from the current fragment, instead of yielding only one value.
spy.decorators¶
This module contains various function decorators for use in spy fragments.
@spy.decorators.accumulate¶
--accumulate, -a¶
    Accumulate values into an iterator by calling spy.collect(), and pass that to the fragment.

    This can be used to write a fragment which executes at most once while passing data through:

    -ma 'x = y;'
@spy.decorators.filter¶
--filter, -f¶
    Use the decorated fragment as a predicate: only elements for which the fragment returns a true value will be passed through.
@spy.decorators.many¶
--many, -m¶
    Call spy.many() with the result of the fragment.
Examples¶
Sort¶
$ spy -mc sorted < test.txt
file
five
has
lines
this
Filter¶
$ spy -l -f 'len(pipe) == 4' < test.txt
this
file
five
Enumerate¶
Naively:
$ spy -m "['{}: {}'.format(n, v) for n, v in enumerate(pipe, 1)]" < test.txt
1: this
2: file
3: has
4: five
5: lines
Taking advantage of spy piping:
$ spy -m 'enumerate(pipe, 1)' "'{}: {}'.format(*pipe)" < test.txt
1: this
2: file
3: has
4: five
5: lines
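For comparison, the two-stage version corresponds roughly to the following plain Python, with the lines of test.txt written out as a list. This is only an illustration of what each stage computes, not of how spy evaluates it.

```python
lines = ["this", "file", "has", "five", "lines"]

# Stage 1: enumerate yields (number, line) pairs, one per output value.
pairs = enumerate(lines, 1)

# Stage 2: format each pair; *pair unpacks it the same way *pipe does above.
numbered = ["{}: {}".format(*pair) for pair in pairs]
```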
Convert CSV to JSON¶
$ spy -c csv.DictReader -c list -c json.dumps < thing.csv > thing.json
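The three callables in that command compose in the same order in ordinary Python. The sketch below uses an in-memory string as a stand-in for thing.csv; the sample data is made up.

```python
import csv
import io
import json

# Hypothetical sample data standing in for thing.csv.
source = io.StringIO("name,lines\ntest.txt,5\nother.txt,12\n")

rows = list(csv.DictReader(source))  # one dict per row, keyed by the header
text = json.dumps(rows)              # values stay strings: csv does no type conversion
decoded = json.loads(text)
```

Note that every value comes out as a JSON string, since csv.DictReader does not infer numeric types.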
Glossary¶
fragment
    An object which can be used by spy.chain() to create chained iterators.

    Only the following kinds of object are considered fragments:

    - The return value of a successful call to spy.fragment()
    - A generator taking exactly one argument, the iterable to get input values from.

    Note
    In any given version of spy, it’s possible that other objects may work as fragments. This is not part of the API, and any accidental support for using other objects may go away at any time.