Tatoo Documentation
Release 0.7.0

Dmitry Malinovsky

March 04, 2015

Contents

1   Installation

2   Documentation Contents
    2.1 Why tatoo?
    2.2 Hello, world!
    2.3 User Guide

Tatoo is an extensible task toolkit.
    • No global state,
    • The only hardcoded behaviour is to be configurable,
    • Extremely extensible.
It aims to provide a generic way to inspect the runtime environment, such as platform specifics (e.g. the OS), current
configuration parameters, registered tasks and so forth, and to be as configurable as possible without any sort of
monkey patching. Although such configurability may be painful, tatoo provides sensible defaults.

CHAPTER 1

Installation

Grab the latest stable version from PyPI:
$ pip install tatoo

CHAPTER 2

Documentation Contents

2.1 Why tatoo?

The answer is very simple: the lack of powerful yet simple task execution toolkits. Tatoo tries to bring the awesomeness
of Celery to local task execution.
Here are some core features tatoo provides:
    • laziness: almost everything is created on demand,
    • extreme configurability,
    • extensibility,
    • a beautiful programming API,
    • it works the same on Python 2.6+, 3.3+ and PyPy,
    • but it is written with Python 3 in mind, so it uses a lot of Python 3 features, carefully backported to Python 2,
    • it is tested continuously on Linux, Windows and MacOS,
    • it doesn’t try to reinvent the wheel; requirements have been chosen carefully.
Of course, there are alternatives to tatoo, and you can use them if they’re more suitable for you.
Two features are commonly missing from the alternatives discussed below:
    • configurability and
    • extensibility.
There are many use cases in which you need to subclass the base Task class, for example to provide a specific
method. How do you tell the library to use your custom subclass instead of the base class? Or how do you add a custom
command to the command line interface?
The answer is usually “you can’t” or “monkey-patch”.
Tatoo is written to be as configurable and extensible as possible. The cases listed above are the “easy level” of tatoo’s
configuration abilities. You can even extend it to call tasks remotely!

2.1.1 Why not Pyinvoke?

The programming interface that tatoo provides looks very similar to Pyinvoke’s. However, Pyinvoke makes a lot
of assumptions about how to handle task arguments. For instance, it generates command line arguments and options
automatically from the function definition and automatically determines the types of arguments that have default
values (using the inspect module). The upside is that you don’t need to write any code to make your task available
from the command line, and keyword arguments are cast to the corresponding types automatically. The downside is that
you can’t really control this process. You can’t specify arguments and options explicitly. You can’t define complex
types (e.g. a File type). Automatic generation of options makes your command line interface depend on the task
signature: if you change the order of arguments, Pyinvoke will generate the options differently.
Pyinvoke also brings unnecessary concepts of pre- and post-tasks, deduplication, namespaces, contexts, standard tasks,
contextualized tasks and so on. This makes Pyinvoke difficult to learn and use.

2.1.2 Why not Doit, Shovel, Paver and others?

Other libraries have similar problems, and often they are too specialized. Some libraries have a very verbose syntax for
defining tasks, others are too simplified. It also seems that these libraries were designed as Make for Python.
Tatoo is not just yet-another-Make, although it can be used that way. The aim of tatoo is to provide a simple but
extensible interface for calling tasks, with room to add further behaviors, so you can grow from
a very simple task to a number of complex tasks calling each other without spending days reading the documentation
and the source code to figure out how to take a small step aside.

2.2 Hello, world!

The first thing you need is an Environment instance. Environment is basically a container for all runtime parameters,
configuration, tasks and so forth:
from tatoo import Environment
env = Environment('myenv')

Each environment instance should have a name. This helps to identify the environment currently in use when
multiple environments are defined.
Once you have an environment instance, you can turn a function into a task by decorating it with the task()
decorator:
@env.task
def hello():
    print('Hello, world!')

Putting it all together in a tasks.py file:
from tatoo import Environment

env = Environment()

@env.task
def hello():
    print('Hello, world!')

Executing the task is as simple as:
>>> from tasks import hello
>>> hello.apply() # prints Hello, world!

2.3 User Guide

2.3.1 Environment

Tatoo uses a central object called the runtime Environment. Instances of this class are used to store the configuration
and global objects, for example, tasks.
The simplest way to create an environment instance looks like this:
from tatoo import Environment
env = Environment()

Environment instances are thread safe (they do not share any global state) so that multiple environments with different
settings, tasks and extensions can co-exist in the same process space.

Environment Name

To make it easier to identify which environment is currently in use, you should give each environment a name:
env = Environment('myenv')

The environment name is included in log messages like this:
[2015-01-29 00:42:23,206: INFO | env: myenv] Task add [...] succeeded in 0.000219s: 3

Settings

Sometimes you will need to make your application configurable, which basically means that you need a globally
accessible configuration storage. There is one you can use:
env.settings['SOMEKEY'] = 'somevalue'

Although it provides a mapping interface, you can also use attribute access:
print(env.settings.SOMEKEY)            # somevalue

Several keys may be updated at once using the update() method:
env.settings.update(
    ONEKEY=True,
    TWOKEY=False,
)

New default values can be added using the add_defaults() method:
env.settings.add_defaults(OTHERKEY=False)
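
The combination of mapping access, attribute access and defaults described above can be approximated in a few lines of plain Python. This is only an illustrative sketch, not tatoo's implementation: the class name SketchSettings is hypothetical, and it assumes that defaults never override values that were set explicitly.

```python
class SketchSettings(dict):
    """Hypothetical stand-in for a settings object (not tatoo's real class)."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails,
        # so dict methods such as update() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def add_defaults(self, **defaults):
        # Assumption: a default is used only if the key is not set yet.
        for key, value in defaults.items():
            self.setdefault(key, value)


settings = SketchSettings()
settings['SOMEKEY'] = 'somevalue'
settings.update(ONEKEY=True, TWOKEY=False)
settings.add_defaults(OTHERKEY=False, ONEKEY=False)

print(settings.SOMEKEY)   # mapping key read through attribute access
print(settings.ONEKEY)    # explicit value wins over the later default
print(settings.OTHERKEY)  # filled in from the defaults
```

Subclassing dict keeps the mapping interface intact, while __getattr__ adds attribute access only as a fallback.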

Censored Settings

If you ever want to print out the configuration, as debugging information or similar, you may also want to filter out
sensitive information like passwords and API keys using the humanize() method:
env.settings.humanize()

If you want to get the humanized dictionary instead, consider using the table() method:

env.settings.table()

Please note that tatoo will not be able to remove all sensitive information, as it merely uses a regular expression to
search for commonly named keys. If you add custom settings containing sensitive information, you should name the
keys using a name that tatoo identifies as secret.
A configuration setting will be censored if its name contains any of these substrings:
API, TOKEN, KEY, SECRET, PASS, SIGNATURE, DATABASE
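
To make the limitation concrete, here is a minimal sketch of substring-based censoring. It is not tatoo's code: the helper name censor, the mask string and the case-insensitive matching are assumptions made for illustration. Only the list of substrings comes from the text above.

```python
import re

# Substrings listed in the documentation above.
SENSITIVE = re.compile(
    r'API|TOKEN|KEY|SECRET|PASS|SIGNATURE|DATABASE', re.IGNORECASE
)


def censor(settings, mask='********'):
    """Return a copy of ``settings`` with sensitive-looking values masked."""
    return {
        key: mask if SENSITIVE.search(key) else value
        for key, value in settings.items()
    }


config = {
    'LOG_LEVEL': 'INFO',
    'GITHUB_TOKEN': 'abc123',
    'DB_PASSWORD': 'hunter2',
    'CREDENTIALS': 's3cr3t',  # not masked: the name has no listed substring
}
censored = censor(config)
print(censored)
```

The CREDENTIALS entry shows why key naming matters: a value is only hidden when its key contains one of the recognized substrings.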

Breaking The Chain

To avoid global state, tatoo follows a practice called the “object chain” or, more specifically, the “env chain”. The
idea is to pass the env instance explicitly to every single object that needs it instead of having a global variable.
To become a chain link, a class should conform to the following rules:
    1. One must be able to set the env attribute directly, and
    2. __init__ must accept an env argument.
Here is an example of a class that follows the rules:
class EnvCompatibleClass(object):
    env = None
    def __init__(self, env=None):
        self.env = env or self.env

This approach makes it possible to avoid a shared global env registry: objects access their own env attribute instead.
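
As a sketch of how such a chain might look in practice, the example below passes an environment from one object to the next. FakeEnv, Reporter and Runner are hypothetical names used only for illustration; a real tatoo Environment would take the place of FakeEnv.

```python
class FakeEnv(object):
    """Hypothetical stand-in for an environment; it only carries a name."""

    def __init__(self, name):
        self.name = name


class EnvCompatibleClass(object):
    env = None

    def __init__(self, env=None):
        self.env = env or self.env


class Reporter(EnvCompatibleClass):
    def describe(self):
        return 'reporting for {0}'.format(self.env.name)


class Runner(EnvCompatibleClass):
    def make_reporter(self):
        # Pass the env down the chain explicitly instead of reading a global.
        return Reporter(env=self.env)


first = Runner(env=FakeEnv('first'))
second = Runner(env=FakeEnv('second'))
print(first.make_reporter().describe())
print(second.make_reporter().describe())
```

Because every link receives its env explicitly, two chains with different environments can coexist in one process without interfering with each other.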

Environment as Registry of Objects

Note: This section requires an understanding of ZCA (Zope Component Architecture) concepts and is intended for
developers. If you’re an end user, you can safely skip this part.

Each environment maintains its own local component registry. In fact, it subclasses
zope.interface.registry.Components.
You can register any custom interface provider using common methods like registerUtility(). For example,
you can register a custom task registry like this:
from tatoo import Environment
from zope.interface import implementer_only
from tatoo.interfaces import ITaskRegistry
from tatoo.tasks import TaskRegistry

@implementer_only(ITaskRegistry)
class MyTaskRegistry(TaskRegistry):
    pass

myregistry = MyTaskRegistry()
env = Environment()
env.registerUtility(myregistry)

To keep the example short, we subclass TaskRegistry directly. However, an object is only required to conform to the
ITaskRegistry interface, so you may write your own implementation from scratch.
This rule generally applies to every single object. You can find all the interfaces that tatoo defines in
tatoo.interfaces.

2.3.2 Tasks

Tasks are callables with some helper wrappers around them.
To define a task, you need two things:
    • an environment instance
    • and a callable.
Then you can easily create a task from this callable using the env.task decorator:
from tatoo import Environment

env = Environment(__name__)

@env.task
def hello():
    print('Hello, world!')

Tatoo will create a specially formed class and instantiate it in place, and the callable will be swapped with this instance.
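
The mechanism of building a class from a callable and swapping the callable for an instance can be sketched as follows. This is an illustration under stated assumptions, not tatoo's implementation: SketchTask, SketchEnv, the run() method and the plain dict used as a task registry are all hypothetical.

```python
class SketchTask(object):
    """Hypothetical base class for generated task classes."""
    name = None

    def run(self, *args, **kwargs):
        raise NotImplementedError

    def __call__(self, *args, **kwargs):
        return self.run(*args, **kwargs)


class SketchEnv(object):
    """Hypothetical environment holding a simple name-to-task mapping."""

    def __init__(self):
        self.tasks = {}

    def task(self, fun):
        # Build a new class whose run() is the decorated function...
        cls = type(fun.__name__, (SketchTask,), {
            'name': fun.__name__,
            'run': staticmethod(fun),
            '__doc__': fun.__doc__,
        })
        # ...instantiate it in place and register the instance by name.
        instance = cls()
        self.tasks[instance.name] = instance
        return instance


env = SketchEnv()


@env.task
def hello():
    return 'Hello, world!'


print(type(hello).__name__)  # the module-level name now refers to an instance
print(hello())
```

Returning the instance from the decorator is what makes the module-level name refer to the task object rather than the original function.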

Names

Every task must be associated with a unique name. You can specify one using the name argument:
@env.task(name='world')
def hello():
    print('Hello, world!')

You can also specify a factory that accepts two arguments, the environment instance and the wrapped callable:
@env.task(name=lambda env, fun: fun.__name__)
def hello():
    print('Hello, world!')

If you omit name, it will be generated automatically. The current behavior is to simply take the callable’s __name__.
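
The three ways of naming a task (explicit string, factory, default) can be summarized by a small resolution function. The helper resolve_name below is hypothetical and only mirrors the rules described above; it is not part of tatoo.

```python
def resolve_name(env, fun, name=None):
    """Hypothetical helper: turn the ``name`` argument into a task name."""
    if name is None:
        # Default described above: use the callable's __name__.
        return fun.__name__
    if callable(name):
        # Factory form: called with the environment and the wrapped callable.
        return name(env, fun)
    return name


def hello():
    print('Hello, world!')


env = object()  # placeholder; a real environment instance would go here
print(resolve_name(env, hello))
print(resolve_name(env, hello, name='world'))
print(resolve_name(env, hello, name=lambda env, fun: fun.__name__.upper()))
```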
You can find out the name of a task by inspecting its name attribute:
>>> hello.name
'hello'

Names can be used to get tasks from the task registry:
>>> env.tasks['hello']

Parameters

Like normal Python callables, tasks may have arguments. In the simplest case you can declare a parametrized task like
this:
@env.task
def copy(src, dst, recursive=False):
    print('Copying {src} to {dst}{postfix}'.format(
          src=src, dst=dst, postfix=' recursively' if recursive else ''
    ))

However, there are problems with this approach. One of them is that you often want to validate incoming
values. Given the example above,
     • src must be a string representing a path; this path must exist, it must be resolvable (i.e. symlinks must be
       resolved) and it must be readable,
     • dst, similarly to src, must be a string representing a path; it does not have to exist, but it must be writable,
     • the recursive argument acts as a flag and must be a bool,
     • finally, if recursive is False, src must not be a directory.
Of course, you can validate arguments manually in the task body, but that makes your task a little harder to understand
and to maintain. It also becomes difficult to tell which rules a valid argument must follow.
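
For illustration, manual validation of the rules listed above might look like the following sketch. It uses only the standard library and leaves out the task decorator. The writability check is a simplification that tests the parent directory of dst, and the error messages are made up.

```python
import os


def copy(src, dst, recursive=False):
    """Task body with hand-written validation (the approach argued against)."""
    if not isinstance(recursive, bool):
        raise TypeError('recursive must be a bool')
    src = os.path.realpath(src)  # resolve symlinks
    if not os.path.exists(src):
        raise ValueError('src does not exist: {0}'.format(src))
    if not os.access(src, os.R_OK):
        raise ValueError('src is not readable: {0}'.format(src))
    if os.path.isdir(src) and not recursive:
        raise ValueError('src is a directory; pass recursive=True')
    # dst may not exist yet, so check that its parent directory is writable.
    parent = os.path.dirname(os.path.abspath(dst)) or os.curdir
    if not os.access(parent, os.W_OK):
        raise ValueError('dst is not writable: {0}'.format(dst))
    # Only now does the actual work start.
    return 'Copying {0} to {1}{2}'.format(
        src, dst, ' recursively' if recursive else '')
```

Most of the function body is validation rather than work, and the rules are buried in control flow instead of being declared next to the arguments they apply to.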
Another problem is that if you want to build a command-line interface and call your tasks with it, there is no
transparent rule for mapping arguments defined in the Python code to command-line arguments. Let’s start with the
obvious one:
cli copy -r/--recursive SRC DST

However, nothing prevents this mapping instead:
cli copy SRC DST [RECURSIVE]

It becomes even more ambiguous when you have options with parameters, like:
@env.task
def copy(src, dst, recursive=False, backup=None):
    if backup is not None:
        backup = ' [backing up to {0}]'.format(backup)
    print('Copying {src} to {dst}{postfix}{backup}'.format(
          src=src, dst=dst, postfix=' recursively' if recursive else '',
          backup=backup or '',
    ))

Now it is completely impossible to tell programmatically whether backup should be:
     • a boolean flag, or
     • an option accepting a path, or
     • an optional argument.
There is one more reason why automatically generating command-line options and arguments
from the task signature is impracticable. Consider the following example:
import sys

@env.task
def echo(message, options=None, outfile=None):
    file = sys.stderr if outfile is None else outfile
    if options is not None:
        if options['colored']:
            # colored() is assumed to be defined or imported elsewhere
            message = colored(message)
    print(message, file=file)

For the options argument we generate the short flag -o. For outfile we can’t use -o because it is already taken, so we
take the next letter: -u. The software evolves, and after a while you decide to swap outfile and options in the
task signature. Now all your scripts are broken, because the -o flag is now used for outfile, options
uses -p, and the -u flag is gone.
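
The instability is easy to reproduce with a naive flag generator. The function short_flags below is a hypothetical generator written for this illustration (it does not come from tatoo or Pyinvoke): for every keyword argument it takes the first letter of the name that is not already in use.

```python
import inspect


def short_flags(func):
    """Naively derive a short flag for every keyword argument of ``func``."""
    flags = {}
    used = set()
    for name, param in inspect.signature(func).parameters.items():
        if param.default is inspect.Parameter.empty:
            continue  # positional argument, no flag generated
        # First letter of the name that no earlier argument has claimed.
        letter = next(c for c in name if c not in used)
        used.add(letter)
        flags[name] = '-' + letter
    return flags


def echo_v1(message, options=None, outfile=None):
    pass


def echo_v2(message, outfile=None, options=None):  # arguments swapped
    pass


print(short_flags(echo_v1))
print(short_flags(echo_v2))
```

The two signatures accept exactly the same keyword arguments, yet the generated flags differ, so any script written against the first version breaks after the swap.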
Tatoo solves all these problems with the parameter() decorator:
from tatoo import parameter
from tatoo.task import types

def validate_path(path, arguments):
    if not arguments['recursive']:
        path.dir_ok = False
        path(arguments['src'])

@env.task
@parameter('SOURCE', 'src',
           type=types.Path(exists=True, file_ok=True, dir_ok=True,
                           readable=True, resolve_path=True,
                           validator=validate_path))
@parameter('DEST', 'dst', type=types.Path(writable=True))
@parameter('-R', '-r', '--recursive',
           help='Copy directories recursively', is_flag=True)
def copy(src, dst, recursive):
    """Copy SOURCE to DEST."""

Now you can say unambiguously that src maps to the SOURCE command-line argument, dst maps to DEST, and
recursive is a boolean flag that can be given as the short options -R or -r or as the long option --recursive.
All basic validation happens in the types module, and you can specify additional validation using the validator
argument, as shown in the example above.
You can inspect all registered parameters via the parameters attribute.

Note: Tatoo itself does not provide a command-line interface. However, one can be implemented as an external package,
so tatoo must provide some way to create command-line interfaces unambiguously. Parameters are also useful without
a CLI: for example, tatoo performs type checks before running tasks when the TATOO_VALIDATE_PARAMS setting is
True.

Execution & Results

To execute a task, use the apply() method:
>>> from tatoo import Environment
>>> env = Environment('test')
>>> @env.task
... def add(x, y):
...     return x + y
...
>>> res = add.apply(args=(1, 2))

Note that the returned value is an instance of the EagerResult class:
>>> res

It is a convenient result wrapper that allows you to inspect various metrics:
>>> res.result
3

>>> res.state
’SUCCESS’
>>> res.runtime
4.540802910923958e-05

Let’s make our task raise a TypeError exception:
>>> res = add.apply(args=(1, '2'))
>>> res.failed()
True
>>> res.result
TypeError("unsupported operand type(s) for +: 'int' and 'str'",)
>>> print(res.traceback)
Traceback (most recent call last):
  File "/home//tatoo/task/trace.py", line 107, in trace_task
    **request['kwargs'])
  File "", line 3, in add
TypeError: unsupported operand type(s) for +: 'int' and 'str'

>>> res.state
'FAILURE'

You can also propagate exceptions like this:
>>> res = add.apply(args=(1, '2'))
>>> res.get()
Traceback (most recent call last):
  File "", line 1, in 
  File "/home//tatoo/task/result.py", line 41, in get
    self.maybe_reraise()
  File "/home//tatoo/task/result.py", line 58, in maybe_reraise
    raise self.result
  File "/home//tatoo/task/trace.py", line 107, in trace_task
    **request['kwargs'])
  File "/home//test.py", line 8, in add
    return x + y
TypeError: unsupported operand type(s) for +: 'int' and 'str'

apply() also accepts a number of extra parameters, which form the execution request:
>>> res = add.apply(args=(1,), kwargs={'y': 2}, request_id='someid')
>>> res.result
3
>>> res.request_id
'someid'
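
The behavior shown in this section (run the callable immediately, capture the outcome, re-raise on demand) can be sketched without tatoo. SketchResult and the module-level apply() function below are hypothetical and much simpler than tatoo's EagerResult. They only mirror the attributes used in the examples above.

```python
import time
import traceback


class SketchResult(object):
    """Hypothetical eager result wrapper (not tatoo's EagerResult)."""

    def __init__(self, result, state, runtime, tb=None):
        self.result = result
        self.state = state
        self.runtime = runtime
        self.traceback = tb

    def failed(self):
        return self.state == 'FAILURE'

    def get(self):
        # Re-raise the stored exception, otherwise return the value.
        if self.failed():
            raise self.result
        return self.result


def apply(fun, args=(), kwargs=None):
    """Call ``fun`` immediately and capture the outcome instead of raising."""
    started = time.time()
    try:
        value, state, tb = fun(*args, **(kwargs or {})), 'SUCCESS', None
    except Exception as exc:
        value, state, tb = exc, 'FAILURE', traceback.format_exc()
    return SketchResult(value, state, time.time() - started, tb)


def add(x, y):
    return x + y


ok = apply(add, args=(1, 2))
bad = apply(add, args=(1,), kwargs={'y': '2'})
print(ok.state, ok.result)
print(bad.state, type(bad.result).__name__)
```

Storing the exception as the result keeps failure inspection and error propagation separate: callers can look at state and traceback, or call get() to have the exception raised.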

2.3.3 Extensions

Tatoo follows the microservice architecture by dividing its components into third-party packages. This allows tatoo
itself to be small, simple and testable.
It is very easy to write custom extensions. Let’s create one in a maths.py file:
from tatoo.extension import Extension

ext = Extension(__name__, version='0.1.0')

@ext.task
def add(x, y):
    return x + y

@ext.task
def sub(x, y):
    return x - y

@ext.task
def mul(x, y):
    return x * y

@ext.task
def div(x, y):
    return x / y

 Task names
 In the Names section you learned that each task has a unique name and that, by default, it is generated from the
 callable’s name. This is not true for tasks defined in extensions. The default behavior there is to prepend the
 extension name and a dot, so that the add task defined in the maths extension gets the name maths.add by default.

You will also need a simple setup.py:
from setuptools import setup

setup(
    name='math-tasks',
    version='0.1.0',
    py_modules=['maths'],
    entry_points="""
    [tatoo.extensions]
    maths = maths:ext
    """
)

Now you’re able to install this extension using pip:
$ pip install .

Let’s make sure that our maths extension is loaded and usable:
>>> from tatoo import Environment
>>> env = Environment('testenv')
>>> print(list(env.extensions))
['maths']
>>> res = env.tasks['maths.add'].apply(args=(1, 2))
>>> res.result
3

2.3.4 Logging

TODO


2.3.5 Signals

TODO
