Using Stata in Quarto

Interactive notebooks with nbstata

Published

January 1, 2025

Standard Stata workflows separate code from documentation. You run .do files, check console output, save tables as .tex files. When exploring data or testing specifications, you end up with scattered output and unclear provenance.

Quarto with nbstata offers a different approach for research exploration: code, results, and narrative in self-contained HTML reports. Run regressions, generate tables, and document findings, all in one file that you can share with collaborators or publish online.

This is particularly useful for:

This isn’t for journal submissions (those use LaTeX). It’s for the exploratory work that happens before you know what’s publication-ready.

Requirements: Stata 17+ (the release that added Python integration), Python 3.7+, Quarto.

nbstata is a Jupyter kernel providing native Stata cells with syntax highlighting, inline code evaluation, and enhanced data exploration commands (%browse, %head, %tail).

Installation

You need Stata 17+, Python 3.7+, and Quarto (available from quarto.org).

The key challenge: Quarto doesn’t automatically detect your Python environment. You need to tell it which Python installation has Jupyter. The cleanest approach: install nbstata where Jupyter already works, then point Quarto to that Python.

Step 1: Find your working Jupyter’s Python

which jupyter
head -n 1 "$(which jupyter)"
# Note the Python path in the shebang line
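The same lookup can be scripted. The following is a minimal Python sketch, assuming the jupyter launcher is a script with a shebang line (some installs use a compiled launcher or a wrapper script, in which case the function returns None or the wrapper's interpreter). The function name shebang_interpreter is hypothetical, not part of Jupyter or nbstata.

```python
from pathlib import Path


def shebang_interpreter(script_path):
    """Return the interpreter named on a script's shebang line, or None."""
    with open(script_path, "rb") as fh:
        first_line = fh.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None  # compiled launcher, or a script without a shebang
    parts = first_line[2:].split()
    if not parts:
        return None
    # "#!/usr/bin/env python3" names the interpreter in the second field
    if Path(parts[0]).name == "env" and len(parts) > 1:
        return parts[1]
    return parts[0]
```

Pass it the path printed by which jupyter (or the result of shutil.which("jupyter")). The returned path is the candidate to use in the next steps.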

Step 2: Install nbstata in that environment

# Use the exact Python path from step 1
/path/to/your/jupyter/python -m pip install nbstata
/path/to/your/jupyter/python -m nbstata.install

Step 3: Tell Quarto to use this Python

export QUARTO_PYTHON="/path/to/your/jupyter/python"
quarto check jupyter  # Should show nbstata available
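If you would rather not change your shell environment, the variable can be set for a single Quarto invocation. This is a small Python sketch of that idea. The helper names quarto_env and check_jupyter are hypothetical, and it assumes the quarto executable is on your PATH.

```python
import os
import subprocess


def quarto_env(python_path, base_env=None):
    """Return a copy of the environment with QUARTO_PYTHON set."""
    env = dict(os.environ if base_env is None else base_env)
    env["QUARTO_PYTHON"] = python_path
    return env


def check_jupyter(python_path):
    """Run `quarto check jupyter` with QUARTO_PYTHON pointing at python_path."""
    return subprocess.run(
        ["quarto", "check", "jupyter"],
        env=quarto_env(python_path),
        capture_output=True,
        text=True,
    )
```

Because quarto_env works on a copy, the calling process keeps its original environment, which makes it easy to try several candidate Python paths in turn.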

Example for Homebrew-managed Jupyter (substitute your installed JupyterLab version for 4.4.4):

# Install where Jupyter lives
/opt/homebrew/Cellar/jupyterlab/4.4.4/libexec/bin/python -m pip install nbstata
/opt/homebrew/Cellar/jupyterlab/4.4.4/libexec/bin/python -m nbstata.install

# Configure Quarto
export QUARTO_PYTHON="/opt/homebrew/Cellar/jupyterlab/4.4.4/libexec/bin/python"

Make permanent (optional):

# Add to ~/.zshrc or ~/.bashrc
echo 'export QUARTO_PYTHON="/path/to/your/jupyter/python"' >> ~/.zshrc

Configure Stata Connection (if needed)

nbstata usually auto-detects Stata. If you get path errors, create ~/.config/nbstata/nbstata.conf:

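[nbstata]
stata_dir = /Applications/Stata
edition = mp

Typical stata_dir values by platform:

  • macOS: /Applications/Stata
  • macOS (StataNow): /Applications/StataNow
  • Windows: C:/Program Files/Stata18/
  • Linux: /usr/local/stata18/

Set edition to mp, se, or be. Keep these notes out of the file itself: INI parsers often read a trailing # comment as part of the value, which would corrupt the path.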

macOS note: Point to the directory containing the .app bundle, not the bundle itself.
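One way to sanity-check the file before starting a kernel is to parse it with Python's standard configparser and inspect the values. The sketch below is an assumption-laden illustration: read_nbstata_conf is a hypothetical helper, and nbstata's own parser may handle comments differently from the standard library. It flags one common pitfall, a trailing # comment that ends up inside a value.

```python
import configparser

EDITIONS = ("mp", "se", "be")


def read_nbstata_conf(text):
    """Parse nbstata.conf text; return (settings, list of suspected problems)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("nbstata"):
        return {}, ["no [nbstata] section found"]
    section = parser["nbstata"]
    settings = {
        "stata_dir": section.get("stata_dir"),
        "edition": section.get("edition"),
    }
    problems = []
    for key, value in settings.items():
        if value is None:
            problems.append(f"{key} is not set")
        elif "#" in value:
            # configparser does not strip inline comments by default
            problems.append(f"{key} contains '#': trailing comment in the value?")
    edition = settings["edition"]
    if edition is not None and "#" not in edition and edition not in EDITIONS:
        problems.append(f"edition should be one of {EDITIONS}")
    return settings, problems
```

Call it with the file contents, for example read_nbstata_conf(open(path).read()), and review the returned problem list. An empty list only means the file parsed cleanly here, not that Stata was found.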

Verify Installation

Check that Quarto sees both Jupyter and nbstata:

quarto check jupyter  # Should list nbstata

Test nbstata (in a .qmd file or Jupyter notebook):

%status

Working in VS Code

VS Code with the Quarto extension lets you run cells interactively without rendering the full document. Output appears in the Interactive Window panel.

Setup:

  1. Open the .qmd file in VS Code
  2. Click the kernel selector in the top-right corner (it might show “Python 3.x”)
  3. Select “Select Another Kernel” → “Jupyter Kernel” → “Stata (nbstata)”
  4. Run cells with Shift+Enter

This is useful for exploratory work: iterate quickly on individual cells, then render the full document when ready.

Document Structure

Every Quarto document starts with a YAML header. Minimum configuration:

---
title: "Your Analysis"
jupyter: nbstata
format: html
---

Render to multiple formats from one source:

---
title: "Your Analysis"
jupyter: nbstata
format:
  html:
    toc: true
    code-fold: true
  pdf: default
  docx: default
---

Control execution behavior:

execute:
  warning: false    # Hide Stata warnings
  output: true      # Show output by default
  echo: true        # Show code by default
  cache: false      # Set true for expensive computations

For sharing HTML files, set embed-resources: true so recipients don’t need separate resource files (older Quarto releases call this option self-contained):

format:
  html:
    embed-resources: true
    code-fold: true
    code-line-numbers: true
    toc: true

Stata Code Blocks

Write Stata code in fenced blocks with {stata}:

```{stata}
sysuse auto, clear
summarize price mpg
```
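When starting many exploratory reports, it can help to generate the header and a first cell from a script. The following Python sketch assembles the minimal header shown earlier plus one Stata cell. The function name qmd_skeleton is hypothetical, and the title is assumed to contain no double quotes.

```python
FENCE = "`" * 3  # built at runtime so the example nests cleanly inside a document


def qmd_skeleton(title, stata_lines):
    """Return the text of a minimal .qmd file with one Stata cell."""
    header = [
        "---",
        f'title: "{title}"',
        "jupyter: nbstata",
        "format: html",
        "---",
        "",
    ]
    cell = [FENCE + "{stata}", *stata_lines, FENCE, ""]
    return "\n".join(header + cell)
```

Writing qmd_skeleton("Your Analysis", ["sysuse auto, clear", "summarize price mpg"]) to a file gives a document equivalent to the header and cell above.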

Cell Options

Control individual blocks with #| comments:

```{stata}
#| echo: false
#| output: true
histogram price, normal
```

Common options:

  • echo: false - Hide code, show output
  • output: false - Show code, hide output
  • eval: false - Display code without running
  • fig-cap: "Caption" - Add figure caption

Inline Code

Embed computed values in text using inline expressions. The syntax is:

The dataset contains `{stata} c(N)` observations.
The average price is $`{stata} di %5.2f r(mean)`.

You would first run computations in a code cell:

#| output: false
quietly summarize price

Then reference the results inline. At render time these expressions execute and their results are inserted into the text, which keeps the reported numbers synchronized with the analysis.
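Since inline expressions depend on results left behind by earlier cells, it can be handy to list every inline expression in a document and check each one against the cell that feeds it. This Python sketch does the listing with a regular expression. The helper name inline_expressions is hypothetical, and the pattern assumes the single-backtick `{stata} ...` form shown above on one line.

```python
import re

# One backtick (not part of a fence), "{stata}", spaces, then the expression
INLINE_STATA = re.compile(r"(?<!`)`\{stata\}[ \t]+([^`\n]+)`")


def inline_expressions(qmd_text):
    """Return the inline Stata expressions found in .qmd text, in order."""
    return [expr.strip() for expr in INLINE_STATA.findall(qmd_text)]
```

Cell fences are skipped because the pattern requires a space after {stata} on the same line, whereas a cell fence is followed by a line break.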

nbstata Magic Commands

nbstata adds commands beyond standard Stata:

Data exploration:

  • %browse [varlist] [if] [in] - Interactive data viewer
  • %head [N] [varlist] - First N rows
  • %tail [N] [varlist] - Last N rows

Diagnostics:

  • %status - System info
  • %locals - View local macros
  • %delimit - Check delimiter

Output control:

  • %%echo / %%noecho - Control echoing
  • %%quietly - Suppress cell output
  • %set / %%set - Modify settings

Examples (each group below is its own cell, with the magic on the first line):

%browse price mpg weight if foreign == 1

%head 5 price mpg weight

%status

local myvar "Hello World"

%locals

Example Analysis

A typical workflow examining how car characteristics relate to price, with one cell per step:

sysuse auto, clear
describe

%head 5

#| fig-cap: "Distribution of car prices"
histogram price, normal title("Car Price Distribution")

#| fig-cap: "Price vs. fuel efficiency by origin"
scatter price mpg, by(foreign) title("Price vs. MPG by Origin")

regress price mpg weight

// Add categorical predictor
regress price mpg weight i.foreign
estimates store full_model

#| fig-cap: "Residuals vs. MPG"
predict residuals, residuals
scatter residuals mpg, title("Residuals vs. MPG")

Rendering

In VS Code with the Quarto extension, press Cmd/Ctrl + Shift + K to render the document and open a live preview. The preview refreshes each time the document is re-rendered.

From command line:

quarto render document.qmd           # HTML
quarto render document.qmd --to pdf  # PDF
quarto render --help                 # See options

Output appears in the same directory as the source .qmd file.
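Rendering to several formats in one go can be wrapped in a short script. The sketch below builds the same quarto render commands shown above and runs them in sequence. The helper names render_command and render_all are hypothetical, quarto is assumed to be on your PATH, and PDF output additionally needs a working LaTeX installation.

```python
import subprocess


def render_command(qmd_path, to=None):
    """Build the argument list for `quarto render`, optionally with --to."""
    cmd = ["quarto", "render", str(qmd_path)]
    if to is not None:
        cmd += ["--to", to]
    return cmd


def render_all(qmd_path, formats=("html", "pdf")):
    """Render one document to each format; raises if any render fails."""
    for fmt in formats:
        subprocess.run(render_command(qmd_path, fmt), check=True)
```

Keeping the command construction separate from execution makes it easy to print the commands first and confirm they match what you would type by hand.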

Troubleshooting

Quarto can’t find nbstata:

  • Check: jupyter kernelspec list (should show nbstata)
  • Verify QUARTO_PYTHON points to correct Python
  • Run: quarto check jupyter
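The kernelspec check above can also be automated, which is convenient when comparing several Python installations. This Python sketch reads the JSON form of the kernelspec listing and reports whether a kernel named nbstata is registered. The helper names are hypothetical, and it assumes the jupyter on your PATH is the one you intend Quarto to use.

```python
import json
import subprocess


def has_kernel(kernelspec_json, name="nbstata"):
    """Return True if `name` appears in `jupyter kernelspec list --json` output."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return name in specs


def installed_kernels_json():
    """Return the raw JSON printed by `jupyter kernelspec list --json`."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A False result from has_kernel(installed_kernels_json()) suggests nbstata was installed into a different Python than the one providing this jupyter command.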

Stata connection fails:

  • Check Stata path in ~/.config/nbstata/nbstata.conf
  • macOS: point to directory containing .app, not .app itself
  • Verify edition (MP/SE/BE) matches config

Document won’t render:

  • Check YAML syntax (use spaces, not tabs)
  • Run quarto check
  • Verify Stata code is syntactically correct

Graphics missing:

  • Check graphics commands work in Stata
  • For PDF: verify LaTeX installation
  • Check file paths if using external images

Slow rendering:

  • Use cache: true for expensive computations
  • Set output: false for data prep steps
  • Use eval: false for example code

Interactive mode unresponsive:

  • Restart Jupyter kernel in VS Code
  • Check for infinite loops
  • Verify Stata isn’t waiting for input

Resources