Python Environment Management
A consistent Python environment is essential for reproducible research. This guide covers four popular approaches to Python environment management (pip + venv, Conda/Mamba, uv, and Poetry), organized by tool for easy navigation.
pip + venv is the native Python approach: pip is Python's package installer and venv is Python's built-in virtual environment module. This is the most basic and widely supported method.
Prerequisites
- Python 3.3+ (venv is included in the standard library)
- pip (usually comes with Python installations)
Creating a Virtual Environment
- Navigate to your project directory:
  cd /path/to/your/project
- Create a virtual environment:
  python -m venv myresearch
  Or specify a Python version (if you have multiple versions):
  python3.9 -m venv myresearch
- Activate the environment:
  - Unix/macOS:
    source myresearch/bin/activate
  - Windows:
    myresearch\Scripts\activate
- Verify activation (you should see the environment name in your prompt):
  which python # Unix/macOS
  where python # Windows
- Upgrade pip (recommended):
  python -m pip install --upgrade pip
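Activation can also be checked from inside Python. The following is a minimal sketch, assuming an environment created by venv: in that case `sys.prefix` points at the environment directory while `sys.base_prefix` still points at the base installation. The helper name `is_virtual_env` is hypothetical, chosen for this illustration.

```python
import sys


def is_virtual_env(prefix: str, base_prefix: str) -> bool:
    """Return True when the two prefixes differ, which is how an active
    venv-created environment shows up in sys.prefix / sys.base_prefix."""
    return prefix != base_prefix


if __name__ == "__main__":
    # Report which interpreter is running and whether it belongs to a venv.
    print(f"interpreter: {sys.executable}")
    print(f"virtual environment active: {is_virtual_env(sys.prefix, sys.base_prefix)}")
```

Running this with the environment activated should report the interpreter inside the myresearch directory.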
Installing Packages
With the virtual environment activated:
# Install individual packages
pip install numpy pandas matplotlib scipy
# Install from requirements file
pip install -r requirements.txt
# Install in development mode (for your own package)
pip install -e .
Managing Dependencies
Creating Requirements Files
Generate a requirements file with exact versions:
pip freeze > requirements.txt
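To see roughly what ends up in such a file, the sketch below lists the distributions visible to the current interpreter as `name==version` lines using the standard library's `importlib.metadata`. It is an approximation for illustration, not pip's implementation: `pip freeze` additionally handles editable and URL installs and, by default, leaves out a few packaging tools such as pip itself. The function name `freeze_lines` is hypothetical.

```python
from importlib.metadata import distributions
from typing import List


def freeze_lines() -> List[str]:
    """Collect 'name==version' pins for every installed distribution."""
    pins = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            pins.add(f"{name}=={dist.version}")
    # Sort case-insensitively so the output is stable between runs.
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(freeze_lines()))
```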
Create a more flexible requirements file manually:
# requirements.txt
numpy>=1.21.0
pandas>=1.3.0
matplotlib>=3.4.0
scipy>=1.7.0
requests>=2.25.0
Development Dependencies
Keep development tools separate:
# requirements-dev.txt
pytest>=6.2.0
black>=21.0.0
flake8>=3.9.0
mypy>=0.910
jupyter>=1.0.0
Install development dependencies:
pip install -r requirements-dev.txt
Using pip-tools for Better Dependency Management
Install pip-tools for more sophisticated dependency management:
pip install pip-tools
Create a requirements.in file with high-level dependencies:
# requirements.in
numpy
pandas
matplotlib
scipy
requests
Generate a locked requirements file:
pip-compile requirements.in
This creates requirements.txt with pinned versions for reproducibility.
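A quick consistency check can catch a lock file that has fallen behind its input file. The sketch below assumes simple one-requirement-per-line files and reports names from requirements.in that have no `==` pin in requirements.txt. The helper names are hypothetical, the parsing is deliberately naive (it ignores option lines and treats `#` as a comment marker), and name comparison uses the normalization rule from PEP 503.

```python
import re
from typing import List, Set

NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pinned_names(lock_text: str) -> Set[str]:
    """Names that appear with an exact '==' pin in the lock file text."""
    names = set()
    for raw in lock_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments such as '# via pandas'
        if "==" in line and not line.startswith("-"):
            match = NAME_RE.match(line)
            if match:
                names.add(normalize(match.group()))
    return names


def missing_pins(in_text: str, lock_text: str) -> List[str]:
    """Top-level requirements that have no pin in the lock file."""
    locked = pinned_names(lock_text)
    missing = []
    for raw in in_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        match = NAME_RE.match(line)
        if match and normalize(match.group()) not in locked:
            missing.append(match.group())
    return missing
```

An empty result from `missing_pins` suggests the lock file covers every top-level dependency; a non-empty one is a hint to rerun pip-compile.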
Deactivating and Removing Environments
Deactivate
deactivate
Remove Environment
Simply delete the environment directory:
rm -rf myresearch # Unix/macOS
rmdir /s myresearch # Windows
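The same removal can be done in a platform-independent way from Python. This is a minimal sketch, assuming the directory was created by venv, which writes a `pyvenv.cfg` file at the environment root; the check guards against deleting an unrelated directory by mistake. The function name `remove_venv` is hypothetical.

```python
import shutil
from pathlib import Path


def remove_venv(env_dir) -> None:
    """Delete a virtual environment directory after a basic sanity check."""
    env_path = Path(env_dir)
    # venv writes pyvenv.cfg at the environment root; refuse to delete without it.
    if not (env_path / "pyvenv.cfg").is_file():
        raise ValueError(f"{env_path} does not look like a venv (no pyvenv.cfg)")
    shutil.rmtree(env_path)
```

Deactivate the environment first, then call `remove_venv("myresearch")` from the project directory.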
Project Structure Example
myresearch/
├── myresearch/ # Virtual environment (don't commit to git)
├── src/ # Source code
├── tests/ # Test files
├── data/ # Data files
├── notebooks/ # Jupyter notebooks
├── requirements.txt # Production dependencies
├── requirements-dev.txt # Development dependencies
├── README.md
└── .gitignore # Include myresearch/ here
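The layout above can be scaffolded with a few lines of standard-library Python. This is a sketch under the assumption that the virtual environment is named myresearch as in the tree; the function name `scaffold` is hypothetical, and the virtual environment itself is still created separately with venv.

```python
from pathlib import Path


def scaffold(root, env_name: str = "myresearch") -> Path:
    """Create the project directories and placeholder files, and make sure
    the virtual environment directory is listed in .gitignore."""
    root = Path(root)
    for sub in ("src", "tests", "data", "notebooks"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    for name in ("requirements.txt", "requirements-dev.txt", "README.md"):
        (root / name).touch(exist_ok=True)

    gitignore = root / ".gitignore"
    existing = gitignore.read_text().splitlines() if gitignore.exists() else []
    entry = f"{env_name}/"
    if entry not in existing:
        gitignore.write_text("\n".join(existing + [entry]) + "\n")
    return root
```

Calling `scaffold` twice is harmless: existing files are left alone and the ignore entry is not duplicated.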
When to Use pip + venv
- Learning Python or starting with Python development
- Working with pure Python packages
- Need maximum compatibility across systems
- Want to understand the basics of Python packaging
- Working in environments where conda/poetry aren't available
- Simple projects with straightforward dependencies
Advantages
- Built-in: No additional tools to install
- Universal: Works everywhere Python works
- Simple: Straightforward workflow
- Lightweight: Minimal overhead
- Educational: Helps understand Python packaging
Limitations
- Limited dependency resolution: pip resolves conflicts at install time (since pip 20.3) but produces no lock file without extra tools such as pip-tools
- No binary package management: Relies on PyPI wheels
- Basic environment management: Less sophisticated than alternatives
- No built-in project management: Requires manual setup
Conda is a popular package and environment management system for Python, R, and other languages, designed with scientific computing in mind. Miniconda is a minimal installer for conda. Together they simplify package management and deployment.
Installation
Miniconda (Traditional)
- Visit the Miniconda website and download the appropriate installer.
- Run the installer and follow the prompts.
- Miniconda provides a base Python installation with the conda package manager.
Mamba (Faster Alternative)
Mamba is a fast, robust, and cross-platform package manager that's a drop-in replacement for conda. It uses the same commands and configuration files as conda but is significantly faster.
Install Mamba:
# Option 1: Install mamba in base environment
conda install -n base -c conda-forge mamba
# Option 2: Install Mambaforge (recommended for new installations;
# recent Miniforge installers from the same repository also bundle mamba)
# Download from: https://github.com/conda-forge/miniforge#mambaforge
Creating a Conda Environment
After installing Miniconda or Mambaforge, create a new environment for your research project:
- Open a terminal (or Anaconda Prompt on Windows).
- Create a new environment named myresearch with Python 3.9:
  Using conda:
  conda create --name myresearch python=3.9
  Using mamba (faster):
  mamba create --name myresearch python=3.9
- Activate the environment:
  conda activate myresearch # or mamba activate myresearch
- Install necessary packages:
  Using conda:
  conda install numpy pandas matplotlib scipy
  Using mamba (faster):
  mamba install numpy pandas matplotlib scipy
Managing Dependencies
Create an environment.yml file to track your dependencies:
name: myresearch
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.9
  - numpy
  - pandas
  - matplotlib
  - scipy
  - pip
  - pip:
    - some-pip-only-package
Recreate the environment from the file:
Using conda:
conda env create -f environment.yml
Using mamba (faster):
mamba env create -f environment.yml
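Indentation matters in environment.yml: the pip-only packages are nested one level under the `pip:` entry. The sketch below renders such a file from plain Python lists so the nesting is explicit. It is an illustration only; the function name `render_environment_yml` is hypothetical and no YAML library is used, so names containing special YAML characters are not handled.

```python
from typing import Sequence


def render_environment_yml(
    name: str,
    channels: Sequence[str],
    conda_deps: Sequence[str],
    pip_deps: Sequence[str] = (),
) -> str:
    """Render a minimal environment.yml as text."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dep}" for dep in conda_deps]
    if pip_deps:
        # pip itself must be a conda dependency before pip packages are listed.
        if "pip" not in conda_deps:
            lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"    - {dep}" for dep in pip_deps]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_environment_yml(
        "myresearch",
        ["conda-forge", "defaults"],
        ["python=3.9", "numpy", "pandas", "matplotlib", "scipy", "pip"],
        ["some-pip-only-package"],
    ))
```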
Mamba vs Conda
Mamba advantages:
- Speed: up to 5-10x faster package resolution and installation than conda's classic solver (conda 23.10+ uses the libmamba solver by default, which narrows the gap)
- Better error messages: More informative when conflicts occur
- Parallel downloads: Downloads packages in parallel
- Same interface: Drop-in replacement for conda commands
- Better dependency solving: More robust solver algorithm
When to use Mamba:
- Large environments with many packages
- Frequent package installations/updates
- Complex dependency resolution scenarios
- When conda feels slow
Mamba command equivalents:
# Replace 'conda' with 'mamba' in any command
mamba install package_name
mamba create -n env_name python=3.9
mamba env export > environment.yml
mamba list
mamba search package_name
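Because the two tools share a command-line interface, a setup script can pick whichever is installed. The sketch below prefers mamba and falls back to conda by looking both up on PATH with `shutil.which`. The function names are hypothetical, and the `which` parameter exists only so the lookup can be replaced in tests.

```python
import shutil
from typing import Callable, List, Optional


def conda_frontend(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return 'mamba' if it is on PATH, otherwise 'conda'."""
    for tool in ("mamba", "conda"):
        if which(tool):
            return tool
    raise RuntimeError("neither mamba nor conda was found on PATH")


def create_env_command(
    env_name: str,
    python_version: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Build the argument list for creating an environment."""
    return [conda_frontend(which), "create", "--name", env_name, f"python={python_version}"]
```

The returned list can be passed to `subprocess.run` unchanged, for example `subprocess.run(create_env_command("myresearch", "3.9"), check=True)`.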
When to Use Conda/Mamba
- Working with scientific computing packages
- Need packages from multiple languages (Python, R, C++)
- Working with complex binary dependencies
- Need pre-compiled packages for performance
- Large environments with many packages (prefer mamba)
uv is an extremely fast Python package installer and resolver, written in Rust. It's designed as a drop-in replacement for pip and pip-tools.
Installation
Install uv using pip:
pip install uv
Or using curl (on Unix systems):
curl -LsSf https://astral.sh/uv/install.sh | sh
Creating a Virtual Environment
- Create a new virtual environment:
  uv venv myresearch
- Activate the environment:
  - On Unix/macOS:
    source myresearch/bin/activate
  - On Windows:
    myresearch\Scripts\activate
- Install packages:
  uv pip install numpy pandas matplotlib scipy
Managing Dependencies
Create a requirements.txt file:
uv pip freeze > requirements.txt
Install from requirements:
uv pip install -r requirements.txt
For development dependencies, use requirements-dev.txt:
uv pip install -r requirements-dev.txt
When to Use uv
- Want the fastest package installation
- Working primarily with pure Python packages
- Need a drop-in replacement for pip
- Want minimal overhead
Poetry is a tool for dependency management and packaging in Python. It allows you to declare the libraries your project depends on and manages them for you.
Installation
Install Poetry using the official installer:
curl -sSL https://install.python-poetry.org | python3 -
Or using pip:
pip install poetry
Creating a New Project
- Create a new project:
  poetry new myresearch
  cd myresearch
- Or initialize Poetry in an existing project:
  cd myresearch
  poetry init
Managing Dependencies
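- Add dependencies:
  poetry add numpy pandas matplotlib scipy
- Add development dependencies:
  poetry add --group dev pytest black flake8
- Install all dependencies:
  poetry install
- Activate the virtual environment:
  poetry shell
  Note: in Poetry 2.0 and later, the shell command is provided by the poetry-plugin-shell plugin; poetry env activate prints the activation command instead.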
Configuration File
Poetry uses pyproject.toml to manage project configuration:
[tool.poetry]
name = "myresearch"
version = "0.1.0"
description = "My research project"
authors = ["Your Name <your.email@example.com>"]
[tool.poetry.dependencies]
python = "^3.9"
numpy = "^1.21.0"
pandas = "^1.3.0"
matplotlib = "^3.4.0"
scipy = "^1.7.0"
[tool.poetry.group.dev.dependencies]
pytest = "^6.2.0"
black = "^21.0.0"
flake8 = "^3.9.0"
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
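The caret (`^`) constraints in this file allow updates that do not change the leftmost non-zero version component, so `^1.21.0` means `>=1.21.0,<2.0.0`. The sketch below expands a caret constraint into that explicit range for plain numeric versions. It illustrates the rule and is not Poetry's own parser; the function name `caret_range` is hypothetical and pre-release or local version suffixes are not supported.

```python
def caret_range(version: str) -> str:
    """Expand a caret constraint such as '1.21.0' into an explicit range."""
    parts = [int(p) for p in version.split(".")]
    # Bump the leftmost non-zero component; if all are zero, bump the last one.
    bump = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = parts[:bump] + [parts[bump] + 1] + [0] * (len(parts) - bump - 1)
    upper_text = ".".join(str(p) for p in upper)
    return f">={version},<{upper_text}"


if __name__ == "__main__":
    for spec in ("3.9", "1.21.0", "0.2.3", "0.0.3"):
        print(f"^{spec} -> {caret_range(spec)}")
```

For the dependencies above this gives, for instance, `>=1.3.0,<2.0.0` for pandas, which is why a lock file is still needed to fix the exact versions.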
When to Use Poetry
- Building distributable Python packages
- Need sophisticated dependency resolution
- Want integrated project management
- Prefer declarative dependency management
Best Practices
- Always use virtual environments to isolate project dependencies
- Pin dependency versions for reproducibility
- Document your environment setup in your project README
- Use lock files (poetry.lock, conda-lock) when available
- Keep development and production dependencies separate
- Regularly update dependencies while testing for compatibility
Reproducibility Tips
- Include environment files (environment.yml, pyproject.toml, requirements.txt) in version control
- Document the Python version and environment manager used
- Consider using Docker for ultimate reproducibility
- Test your environment setup on different machines
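Documenting the Python version is easier when it is captured automatically. The sketch below gathers a few facts about the running interpreter with the standard library so they can be pasted into a README or saved next to results. The function name `environment_summary` and the dictionary keys are hypothetical, and the environment manager name is supplied by the caller because Python cannot reliably detect it.

```python
import platform
import sys
from typing import Dict


def environment_summary(manager: str) -> Dict[str, str]:
    """Collect basic facts about the interpreter and platform in use."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "executable": sys.executable,
        "manager": manager,  # e.g. "pip + venv", "conda", "uv", "poetry"
    }


if __name__ == "__main__":
    for key, value in environment_summary("pip + venv").items():
        print(f"{key}: {value}")
```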