Moving to Wyoming

Steve Howard
steve@thumbtack.com

12/12/2013

Two important concepts

  1. Composition: a simple form of modularity
  2. Dependency injection: a style of encoding relationships between objects

What's all the fuss about?

Alan Snyder

Encapsulation has many advantages in terms of improving the understandability of programs and facilitating program modification. Unfortunately, in most object-oriented languages, the introduction of inheritance severely compromises encapsulation.

From Encapsulation and Inheritance in Object-Oriented Programming Languages, 1986, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.8552&rep=rep1&type=pdf.

Joshua Bloch

To avoid this fragility, use composition and forwarding instead of inheritance... Not only are wrapper classes more robust than subclasses, they are also more powerful.

public class InstrumentedHashSet extends HashSet {
    ...
    public boolean add(Object o) {
        addCount++;
        return super.add(o);
    }
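    public boolean addAll(Collection c) {
        // Fragile: HashSet.addAll is implemented on top of add(), so the
        // overridden add() above counts every element a second time.
        addCount += c.size();
        return super.addAll(c);
    }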
    public int getAddCount() {
        return addCount;
    }
}

From Effective Java, 2001, Addison-Wesley.
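
The same fragility can be reproduced in Python. This is a minimal sketch, not Bloch's code: SimpleSet is a hypothetical base class whose add_all() happens to call add(), much as HashSet's addAll does in the Java example.

```python
class SimpleSet(object):
    """Hypothetical base class; add_all() happens to be built on add()."""

    def __init__(self):
        self._items = set()

    def add(self, item):
        self._items.add(item)

    def add_all(self, items):
        for item in items:
            self.add(item)  # implementation detail a subclass cannot see


class InstrumentedSimpleSet(SimpleSet):
    """Tries to count insertions by overriding both methods."""

    def __init__(self):
        super(InstrumentedSimpleSet, self).__init__()
        self.add_count = 0

    def add(self, item):
        self.add_count += 1
        super(InstrumentedSimpleSet, self).add(item)

    def add_all(self, items):
        items = list(items)
        self.add_count += len(items)
        super(InstrumentedSimpleSet, self).add_all(items)


s = InstrumentedSimpleSet()
s.add_all(['a', 'b', 'c'])
print(s.add_count)  # 6, not 3: every element was counted twice
```

The subclass is only correct if it knows how the parent implements add_all(), which is exactly the encapsulation leak the quotes describe.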

Rob Pike

We argue that this compositional style of system construction has been neglected by the languages that push for design by type hierarchy. Type hierarchies result in brittle code... Go therefore encourages composition over inheritance, using simple, often one-method interfaces to define trivial behaviors that serve as clean, comprehensible boundaries between components.

From Go at Google: Language Design in the Service of Software Engineering, 2012, http://talks.golang.org/2012/splash.article

Marius Eriksen

Use dependency injection for program modularization, and in particular, prefer composition over inheritance — for this leads to more modular and testable programs. When encountering a situation requiring inheritance, ask yourself: how would you structure the program if the language lacked support for inheritance? The answer may be compelling.

From Effective Scala, 2012, http://twitter.github.io/effectivescala

Inheritance and Composition

Inheritance

Composition

The conceptual view

(What they teach you in school)

The operational view

(What you figure out when it's too late)

Python Thread documentation

There are two ways to specify the activity: by passing a callable object to the constructor, or by overriding the run() method in a subclass. No other methods (except for the constructor) should be overridden in a subclass. In other words, only override the __init__() and run() methods of this class.

If the subclass overrides the constructor, it must make sure to invoke the base class constructor (Thread.__init__()) before doing anything else to the thread.

http://docs.python.org/2/library/threading.html#thread-objects
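
The two options from the documentation, side by side. A minimal sketch; work, Worker and results are illustrative names, not part of the threading module.

```python
import threading

results = []


def work(label):
    results.append(label)


# Option 1: pass a callable to the constructor (composition).
t1 = threading.Thread(target=work, args=('callable',))


# Option 2: subclass Thread and override run() (inheritance).
class Worker(threading.Thread):
    def __init__(self, label):
        # The base constructor has to run before anything else
        # touches the thread, as the documentation warns.
        super(Worker, self).__init__()
        self.label = label

    def run(self):
        results.append(self.label)


t2 = Worker('subclass')
for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join()

print(sorted(results))  # ['callable', 'subclass']
```

Option 1 needs no knowledge of Thread's internals. Option 2 comes with rules about which methods may be overridden and in what order to call them.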

Where do these function calls go?

In a system that uses inheritance extensively:

class EventTrackingClient(JsonHttpClient):
    ...

    def send(self, data):
        message = self.build_message(data)
        self.send_message(message)

    ...

In a system that avoids inheritance:

class EventTrackingClient(object):
    ...

    def send(self, data):
        message = self.build_message(data)
        self.json_http_client.send_message(message)

    ...
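
A runnable version of the composition variant, as a sketch. RecordingJsonHttpClient and the body of build_message() are assumptions made up for illustration; the slides elide both.

```python
import json


class RecordingJsonHttpClient(object):
    """Hypothetical stand-in for JsonHttpClient: records instead of sending."""

    def __init__(self):
        self.sent = []

    def send_message(self, message):
        self.sent.append(message)


class EventTrackingClient(object):
    def __init__(self, json_http_client):
        self.json_http_client = json_http_client

    def build_message(self, data):
        # Defined on this class, so there is no parent class to search.
        return json.dumps({'event': data}, sort_keys=True)

    def send(self, data):
        message = self.build_message(data)
        # The receiver is named explicitly: this call leaves the class.
        self.json_http_client.send_message(message)


http_client = RecordingJsonHttpClient()
EventTrackingClient(http_client).send('signup')
print(http_client.sent)  # ['{"event": "signup"}']
```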

Coupling

Composition leads to looser coupling than inheritance

Dependency management is really important

Ways to manage object dependencies

  1. Static cling
    • http://googletesting.blogspot.com/2008/06/defeat-static-cling.html
  2. Singleton
  3. Dependency injection

Our example case

my_app.py

class MyApp(object):
    def get_response(self, request):
        return Response(
            somehow render('hello.html',
                           name=request.args['name'])
        )

Static cling

templates.py

import jinja2

jinja_environment = jinja2.Environment(
    loader=jinja2.PackageLoader('my_package', 'templates'))

def render(template_name, **context):
    return (jinja_environment.get_template(template_name)
            .render(**context))

my_app.py

import templates

class MyApp(object):
    def get_response(self, request):
        return Response(
            templates.render('hello.html',
                             name=request.args['name']))

main.py

import my_app

def main():
    app = my_app.MyApp()
    ...

Singleton

templates.py

import jinja2

class TemplateManager(object):
    def __init__(self):
        self._environment = jinja2.Environment(loader=...)

    def render(self, template_name, **context):
        return (self._environment.get_template(template_name)
                .render(**context))

manager = TemplateManager()

my_app.py

import templates

class MyApp(object):
    def get_response(self, request):
        return Response(templates.manager.render(
            'hello.html', name=request.args['name']))

main.py

import my_app

def main():
    app = my_app.MyApp()
    ...

Dependency injection

templates.py

class TemplateManager(object):
    def __init__(self, jinja_environment):
        self._environment = jinja_environment

    def render(self, template_name, **context):
        return (self._environment.get_template(template_name)
                .render(**context))

my_app.py

class MyApp(object):
    def __init__(self, template_manager):
        self._template_manager = template_manager

    def get_response(self, request):
        return Response(
            self._template_manager.render(
                'hello.html', name=request.args['name']))

main.py

import jinja2
import my_app
import templates

def main():
    jinja_environment = jinja2.Environment(
        loader=jinja2.PackageLoader('my_package', 'templates'))
    template_manager = (
        templates.TemplateManager(jinja_environment))
    app = my_app.MyApp(template_manager)
    ...
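
Because MyApp only sees whatever is passed to its constructor, it can be exercised without jinja2 at all. A sketch under assumptions: Request, Response and StubTemplateManager are hypothetical stand-ins, not the real framework types.

```python
from collections import namedtuple

# Hypothetical stand-ins for the web framework's request/response types.
Request = namedtuple('Request', ['args'])
Response = namedtuple('Response', ['body'])


class StubTemplateManager(object):
    """Same render() signature as TemplateManager, but no jinja2 needed."""

    def render(self, template_name, **context):
        return '%s %r' % (template_name, sorted(context.items()))


class MyApp(object):
    def __init__(self, template_manager):
        self._template_manager = template_manager

    def get_response(self, request):
        return Response(
            self._template_manager.render(
                'hello.html', name=request.args['name']))


app = MyApp(StubTemplateManager())
response = app.get_response(Request(args={'name': 'Ada'}))
print(response.body)  # hello.html [('name', 'Ada')]
```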

Composition > Inheritance

DI >> Singleton >> Static cling

Composition and DI play nicely together

Readability

What does TransactionHelper do?

class TransactionHelper(object):
    def __init__(self):
        ...

How about now?

class TransactionHelper(object):
    def __init__(self, user_manager, credit_card_processor, shopping_cart):
        ...

From earlier:

class EventTrackingClient(object):
    def send(self, data):
        message = self.build_message(data)
        self.json_http_client.send_message(message)

Extensibility

Reusable wrappers

set1 = InstrumentedSet(set())
set2 = InstrumentedSet(TreeSet())
set3 = InstrumentedSet(sparse_integer_set)

Temporary wrappers on existing objects

def do_stuff(some_set):
    instrumented_set = InstrumentedSet(some_set)
    # ... use instrumented_set in this method

(Examples from Effective Java)
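
One possible Python rendering of such a wrapper, as a sketch rather than a port of Bloch's Java class. It assumes only that the wrapped object has add() and supports `in` and len().

```python
class InstrumentedSet(object):
    """Wraps any set-like object and counts attempted insertions."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        self.add_count = 0

    def add(self, item):
        self.add_count += 1
        self._wrapped.add(item)

    def add_all(self, items):
        # Goes through our own add(), so the count does not depend on
        # how the wrapped object implements bulk insertion.
        for item in items:
            self.add(item)

    def __contains__(self, item):
        return item in self._wrapped

    def __len__(self):
        return len(self._wrapped)


existing = set(['a'])
instrumented = InstrumentedSet(existing)
instrumented.add_all(['a', 'b', 'c'])
print(instrumented.add_count)  # 3
print(sorted(existing))        # ['a', 'b', 'c']
```

The wrapper works on an object that already exists, and the wrapped set stays usable on its own afterwards.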

Testability

Without DI:

def test_save(self):
    user = User(email, password)
    user.save()

$ ./test_models.py
ERROR: test_save (__main__.UserTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_models.py", line 10, in test_save
    user.save()
  File "models.py", line 64, in save
    db = database.get_database()
  File "database.py", line 103, in get_database
    connection = psycopg2.connect(host=DEFAULT_HOST)
OperationalError: could not translate host name "master.postgres.internal"
to address: Name or service not known

With DI:

def test_save(self):
    database = FakeDatabase()
    user = User(email, password)
    user.save(database)
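
A self-contained version of the DI test, as a sketch. FakeDatabase, its insert() method and the body of User.save() are assumptions for illustration; the slides show none of them.

```python
import unittest


class FakeDatabase(object):
    """Hypothetical in-memory stand-in; insert() is an assumed method name."""

    def __init__(self):
        self.rows = []

    def insert(self, table, row):
        self.rows.append((table, row))


class User(object):
    def __init__(self, email, password):
        self.email = email
        self.password = password

    def save(self, database):
        # The database is handed in, so no hostname is ever resolved.
        database.insert('users', {'email': self.email})


class UserTest(unittest.TestCase):
    def test_save(self):
        database = FakeDatabase()
        user = User('a@example.com', 'secret')
        user.save(database)
        self.assertEqual(
            database.rows, [('users', {'email': 'a@example.com'})])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(UserTest)
result = unittest.TextTestRunner().run(suite)
```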

Debugging

Coarse-grained:

import database
database.enable_debug_logging = True

Fine-grained:

def construct_app(configuration):
    database = Database(configuration.database_credentials)
    user_manager = UserManager(database)
    transaction_manager = TransactionManager(
        LoggingDatabaseWrapper(database)
    )
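
LoggingDatabaseWrapper could look roughly like this. A sketch only: the Database class, its query() method and ListHandler are hypothetical and exist just to make the example run.

```python
import logging


class LoggingDatabaseWrapper(object):
    """Forwards everything to the wrapped database, logging method calls."""

    def __init__(self, database, logger=None):
        self._database = database
        if logger is None:
            logger = logging.getLogger('database')
        self._logger = logger

    def __getattr__(self, name):
        # Only reached for names not found on the wrapper itself.
        attribute = getattr(self._database, name)
        if not callable(attribute):
            return attribute

        def logged(*args, **kwargs):
            self._logger.debug('%s args=%r kwargs=%r', name, args, kwargs)
            return attribute(*args, **kwargs)
        return logged


class Database(object):
    """Hypothetical database with a single query() method."""

    def query(self, sql):
        return 'result of ' + sql


class ListHandler(logging.Handler):
    """Collects log messages in a list so they can be inspected."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger = logging.getLogger('database')
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

wrapped = LoggingDatabaseWrapper(Database(), logger)
print(wrapped.query('SELECT 1'))  # result of SELECT 1
print(handler.messages)
```

Only the objects that were handed the wrapper produce log lines. Everything built with the plain Database stays quiet.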

Improved design

Without DI: looks perfectly innocent...

class PipelineHandler(object):
    def __init__(self):
        ...

With DI: the guilty are exposed!

class PipelineHandler(object):
    def __init__(self, robots_txt_cache, shard_manager, crawl_queue,
                 writer_factory, queue_snapshotter, dns_cache, link_sender,
                 system_facade, stat_keeper, crawl_depth, retain_input_fields,
                 result_upload_size, add_seeds_to_due):
        ...

Movin' out west

From static cling to singleton

Step 1: Make one shared place for all external dependencies

wyoming.py

# The SQLAlchemy Session class
session_factory = None

# The templates.TemplateManager
template_manager = None

...

Step 2: Initialize that shared place when your app starts

main.py

import jinja2
import sqlalchemy
import sqlalchemy.orm

import models
import my_app
import templates
import wyoming

def initialize_wyoming():
    engine = sqlalchemy.create_engine(...)
    models.Base.metadata.create_all(engine)
    wyoming.session_factory = (
        sqlalchemy.orm.sessionmaker(bind=engine))

    jinja_environment = jinja2.Environment(
        loader=jinja2.PackageLoader('my_package', 'templates'))
    wyoming.template_manager = (
        templates.TemplateManager(jinja_environment))

    ...

def main():
    initialize_wyoming()
    app = my_app.MyApp()
    ...

Step 3: Update all references in your app to point to wyoming

my_app.py

import wyoming

class MyApp(object):
    def get_response(self, request):
        session = wyoming.session_factory()
        results = session.query(...)
        return Response(
            wyoming.template_manager.render(
                'hello.html', name=request.args['name'],
                results=results))

Keep newly-created dependencies out of wyoming!
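
One payoff of a single shared place is that a test can swap a dependency in one spot. A sketch under assumptions: the wyoming module is simulated with a SimpleNamespace so the example fits in one file, render_hello() and StubTemplateManager are hypothetical, and unittest.mock requires Python 3.

```python
import types
from unittest import mock

# Stand-in for the wyoming module so this sketch runs as a single file.
wyoming = types.SimpleNamespace(template_manager=None)


class StubTemplateManager(object):
    def render(self, template_name, **context):
        return 'rendered %s' % template_name


def render_hello(name):
    # Looks the dependency up at call time, as MyApp.get_response does.
    return wyoming.template_manager.render('hello.html', name=name)


with mock.patch.object(wyoming, 'template_manager', StubTemplateManager()):
    body = render_hello('Ada')

print(body)                      # rendered hello.html
print(wyoming.template_manager)  # None: the original value is restored
```

This is still global state being patched, which is why the next steps move the lookups into constructors.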

From singleton to dependency injection

Step 1: Move global access to constructors

my_app.py

import wyoming

class MyApp(object):
    def __init__(self):
        self._session_factory = wyoming.session_factory
        self._template_manager = wyoming.template_manager

    def get_response(self, request):
        session = self._session_factory()
        results = session.query(...)
        return Response(
            self._template_manager.render(
                'hello.html', name=request.args['name'],
                results=results))

Step 2: Move global access to instantiation sites

my_app.py

class MyApp(object):
    def __init__(self, session_factory, template_manager):
        self._session_factory = session_factory
        self._template_manager = template_manager

    def get_response(self, request):
        session = self._session_factory()
        results = session.query(...)
        return Response(
            self._template_manager.render(
                'hello.html', name=request.args['name'],
                results=results))

main.py

import my_app
import wyoming

def initialize_wyoming(): ...

def main():
    initialize_wyoming()
    app = my_app.MyApp(
        wyoming.session_factory, wyoming.template_manager)
    ...

Step 3: Notice you no longer need wyoming

main.py

import jinja2
import sqlalchemy
import sqlalchemy.orm

import models
import my_app
import templates

def main():
    engine = sqlalchemy.create_engine(...)
    models.Base.metadata.create_all(engine)
    session_factory = (
        sqlalchemy.orm.sessionmaker(bind=engine))

    jinja_environment = jinja2.Environment(
        loader=jinja2.PackageLoader('my_package', 'templates'))
    template_manager = (
        templates.TemplateManager(jinja_environment))

    app = my_app.MyApp(session_factory, template_manager)

Farewell, wyoming!

Standing on the shoulders of giants

Based on the work of...

Thanks for the feedback from...

Questions?

steve@thumbtack.com