Nesting: Setting up your Python Environment

When you begin using Python a lot, you realize you’re creating (or re-creating!) the same functions over and over again.  I don’t know how many times, for example, I have re-Googled and re-typed the recipe from the magnificent and eccentric Python library BeautifulSoup to get text out of an HTML document, as when I want to create a simple search index.

I also end up needing to find this meta-recipe for loading my own frequently-used functions and helpers into my Python environment. So I thought I’d write up, in one place, how to set up your Python environment so that your own libraries and code snippets are in scope, and ready for you to use. It’s simple and I really wanted to get it down.

Feathering the Python nest

What we want to end up with is a Python environment in which tools and scripts we have already written are already loaded and at our fingertips. Start Python (or IPython, or some other Python-based environment) and your stuff is there.

In my environment I have created a module called brownhen in which the functions I reuse are defined. You can call yours anything you want: bobby, greatstuff, powertools, whatever. But it’s important to namespace these functions and utilities, so that a call like brownhen.textify() reminds you where they come from and what package they’re a part of. In this tutorial, my stuff lives in “brownhen”.

So this recipe describes how to:

  1. Organize your useful stuff into modules—a very good idea in any case
  2. Set up your environment to find your modules when Python starts (PYTHONPATH)
  3. Tell your environment not only where your modules are, but to pre-load them when you fire Python up (PYTHONSTARTUP)

When I start up the IPython interpreter, which I keep up most of the time and sort of live in, my environment welcomes me and lets me know that my imports are working.

When Python tells me that it has 2) found and 3) loaded the utilities I have 1) organized into my own module(s), it means that I can easily reuse my own code.

PYTHONPATH

Where does Python look for the libraries and code it needs when it starts up? Where can it look for new stuff when you use import statements? It doesn’t just look everywhere on your laptop. That’s bad form and takes too long. 

The answer is that Python starts from a default search path, which is set up when you install Python, and then reads the PYTHONPATH environment variable for any extra directories to add to it. I am using the Anaconda distribution of Python right now, so by default Python’s searching is basically confined to the directory where Anaconda put down its executables, libraries, and tools.
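To see the actual list of places Python searches, you can print sys.path from any session. This is a minimal sketch; the entries differ from machine to machine, so no output is shown.

```python
import sys

# sys.path is the ordered list of directories Python searches on import.
# Directories named in PYTHONPATH appear near the front, ahead of the
# installation's own library directories.
search_path = list(sys.path)

for entry in search_path:
    print(repr(entry))
```

If a directory you expect is missing from this list, Python will not find the modules inside it.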

To add to the list of places that Python searches when you start up, so that import statements can work without a hitch, edit the system PYTHONPATH variable and tell Python where else it should look:

Add the parent directory of “brownhen”, or “bobby”, to PYTHONPATH by editing your .bash_profile file on Mac (or .zshrc if your shell is zsh), your .bashrc file on Linux, or your system environment variables on Windows.

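If you put your stuff in a directory like ~/Dropbox/Programming/packages/brownhen/, then the directory to add is the parent, ~/Dropbox/Programming/packages. On Mac, put a line such as export PYTHONPATH="$HOME/Dropbox/Programming/packages:$PYTHONPATH" into the file ~/.bash_profile.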
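If you want to convince yourself that PYTHONPATH really is what makes the import work, here is a small sketch that tries it out without touching your shell configuration. It builds a throwaway package in a temporary directory, then starts a fresh Python process with that directory's parent on PYTHONPATH. The names demo_nest_pkg and import_with_pythonpath are made up for this illustration.

```python
import os
import subprocess
import sys
import tempfile


def import_with_pythonpath(parent_dir, package_name):
    """Start a fresh interpreter with parent_dir on PYTHONPATH and try the import."""
    env = dict(os.environ)
    existing = env.get("PYTHONPATH")
    # Prepend our directory, keeping whatever PYTHONPATH already held.
    env["PYTHONPATH"] = parent_dir if not existing else parent_dir + os.pathsep + existing
    result = subprocess.run(
        [sys.executable, "-c", "import " + package_name],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


with tempfile.TemporaryDirectory() as parent:
    # A package is just a directory holding an __init__.py file.
    os.mkdir(os.path.join(parent, "demo_nest_pkg"))
    with open(os.path.join(parent, "demo_nest_pkg", "__init__.py"), "w") as f:
        f.write("print('demo_nest_pkg loaded')\n")

    found = import_with_pythonpath(parent, "demo_nest_pkg")

print("import succeeded:", found)
```

Note that the directory placed on PYTHONPATH is the parent of the package, not the package directory itself.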

This tells Python to look there when it starts up. Already, you’ve set things up so that you can reach your brownhen stuff with an ordinary import statement such as import brownhen.
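A quick way to check whether Python can find a module, without actually importing it, is importlib.util.find_spec from the standard library. The helper name is_importable is my own for this sketch.

```python
import importlib.util


def is_importable(name):
    """Return True if Python can locate a top-level module or package by this name."""
    return importlib.util.find_spec(name) is not None


# Prints True once the parent directory of brownhen is on the search path.
print("brownhen findable:", is_importable("brownhen"))
```

If this prints False after you have edited your profile, open a new terminal so the changed PYTHONPATH is picked up.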

PYTHONSTARTUP

Go one step further by not only telling Python where your stuff is, but asking Python to load it when it starts up. Use the PYTHONSTARTUP environment variable for this, setting it to the path of a Python file (mine is called pystart.py).
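To check what the variable currently points at, a small sketch like the following can help. The function name startup_file_status is hypothetical, and so is the location ~/pystart.py mentioned in the comment; use whatever path you actually chose.

```python
import os


def startup_file_status(environ=os.environ):
    """Describe what PYTHONSTARTUP points at, if anything."""
    path = environ.get("PYTHONSTARTUP")
    if not path:
        return "PYTHONSTARTUP is not set"
    path = os.path.expanduser(path)
    if os.path.isfile(path):
        return "PYTHONSTARTUP points at " + path
    return "PYTHONSTARTUP is set, but no file exists at " + path


# In a shell profile the variable is typically set with a line like
#   export PYTHONSTARTUP="$HOME/pystart.py"   (path is an assumption)
print(startup_file_status())
```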

This tells Python that there’s a particular script that should be executed when an interactive session starts up (it is not run for scripts launched as python myscript.py). In this case, the script has one line, and that is the import statement that pulls the brownhen fabulousness in and makes it available. pystart.py consists of that single line, import brownhen, and nothing else.
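A slightly more defensive variant of that startup file is sketched below. The try/except wrapper is my addition, not part of the one-line original; it keeps the interpreter from greeting you with a traceback if brownhen cannot be found.

```python
# pystart.py (sketch): runs at the start of every interactive session.
try:
    import brownhen  # the single import the original file contains
except ImportError as err:
    # Assumption: fall back quietly so the session still starts cleanly.
    brownhen = None
    print("pystart: could not import brownhen:", err)
```

Names defined in the startup file land in the interactive namespace, so after a successful import brownhen is available at the prompt.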

This tells Python to load the brownhen module, which can define functions and execute things right off the bat if you want. For this tutorial, my brownhen module file (see below) looks like this:

See that this file:

  • Imports a library—BeautifulSoup
  • Prints a statement to output to let you know it’s been loaded 
  • Defines two functions: a test (hello) and one we want to have available in all our Python sessions (textify)
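A file matching those three points could look like the sketch below. It is not the original listing: the printed message, the body of hello, and the exact behaviour of textify (dropping script and style content, collapsing whitespace) are assumptions. The standard-library fallback for when BeautifulSoup is not installed is also my addition, so the file stays importable everywhere.

```python
# brownhen/__init__.py (sketch)
from html.parser import HTMLParser

try:
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4
except ImportError:
    BeautifulSoup = None  # use the standard-library fallback below

print("brownhen loaded")  # exact wording is an assumption


class _TextCollector(HTMLParser):
    """Fallback parser: gather text nodes, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def hello():
    """Test function: confirms the module is loaded and callable."""
    return "hello from brownhen"


def textify(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    if BeautifulSoup is not None:
        soup = BeautifulSoup(html, "html.parser")
        for tag in soup(["script", "style"]):
            tag.decompose()  # remove non-visible content
        text = soup.get_text(separator=" ")
    else:
        collector = _TextCollector()
        collector.feed(html)
        collector.close()
        text = " ".join(collector.chunks)
    return " ".join(text.split())
```

With this in place, textify("<p>Hello <b>world</b></p>") returns "Hello world".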

With things set up in this way, I can start Python and begin to use my textify() function right away:

It works! I’ve got my environment set up so that frequently-used functions are 1) organized into a module, which 2) Python can find, and which is 3) loaded when I start up. Power. 

But how does this work? What’s a module again?

Python modules and module loading

The Python docs on this are good, and there’s lots of online support about modules and packages in Python, but just a word about it to round things out here:

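Any directory in your path (that is, in the list of directories you tell Python to search via PYTHONPATH) is understood to be not just a directory but an importable Python package, which is a kind of module, when it has a Python file in it called __init__.py (two underscores on either side of “init”). For brownhen, that means a directory named brownhen containing a file named __init__.py.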

This file __init__.py registers the directory “brownhen” as a module in Python, but it also gets executed as that module is loaded. So if in your PYTHONSTARTUP file you import brownhen, as I have, you execute this file.
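You can watch __init__.py being executed on import with a small experiment. The sketch below writes a throwaway package called demo_init_pkg (a made-up name) into a temporary directory, adds that directory to sys.path for the current session only, and imports it.

```python
import importlib
import os
import shutil
import sys
import tempfile

# Build <tmp>/demo_init_pkg/__init__.py
parent = tempfile.mkdtemp()
package_dir = os.path.join(parent, "demo_init_pkg")
os.mkdir(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as f:
    f.write(
        "print('running demo_init_pkg/__init__.py')\n"
        "def hello():\n"
        "    return 'hello from demo_init_pkg'\n"
    )

# Same effect as PYTHONPATH, but only for this session.
sys.path.insert(0, parent)
importlib.invalidate_caches()

# The import executes __init__.py, so its print statement fires here.
demo = importlib.import_module("demo_init_pkg")
greeting = demo.hello()
print(greeting)

# Clean up the temporary files and the path entry.
sys.path.remove(parent)
shutil.rmtree(parent)
```

The message printed by __init__.py appears at import time, and the function it defines is then available as an attribute of the package.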

You can also turn subdirectories into subpackages, each with its own __init__.py, and do lots of other things, but know that this module initialization file gets executed by Python when the module is loaded, and is where you can put your favorite functions and tools.