
phi-ai

This repository contains code and notebooks for local ML experiments.

What I added

  • .gitignore to ignore common Python, notebook, and environment files.
  • requirements.txt (core runtime packages) and requirements-dev.txt (dev tools).
  • run_local.ps1 — PowerShell helper to set up a local venv and install dependencies on Windows.
  • tests/test_smoke.py — a minimal smoke test to verify the repo layout (a sketch of such a test follows this list).
  • LICENSE (MIT) and CONTRIBUTING.md.
  • GitHub Actions workflow at .github/workflows/python-app.yml that runs lint & tests.
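
For reference, a layout smoke test along the lines of tests/test_smoke.py could look like the sketch below; the exact files it asserts are illustrative assumptions, not necessarily what the repo's test checks.

# Sketch of a repo-layout smoke test (the asserted file names are illustrative assumptions)
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[1]

def test_expected_files_exist():
    # Fail fast if the basic repo layout is missing
    for name in ('requirements.txt', 'requirements-dev.txt', 'LICENSE'):
        assert (REPO_ROOT / name).exists(), f'missing {name}'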

Windows (PowerShell) — Quickstart

These commands assume you are using PowerShell on Windows (pwsh.exe or powershell.exe) and that Python 3.10+ is installed and on PATH.

  1. Create and activate a virtual environment
python -m venv .venv
# PowerShell activation
.\.venv\Scripts\Activate.ps1
# (if you get an execution policy error, run: Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned; the CurrentUser scope does not require Administrator rights)
  2. Install core requirements
pip install --upgrade pip
pip install -r requirements.txt
# Dev tools (optional)
pip install -r requirements-dev.txt
  3. Run the example script or notebook
# Run the main script (if present)
python .\phi-ai.py

# Start Jupyter (recommended for notebooks)
jupyter notebook phi-ai.ipynb
# or
jupyter lab

Installing PyTorch (CPU or GPU)

PyTorch is not pinned in requirements.txt because the correct wheel depends on your CUDA toolkit. Follow the official instructions at https://pytorch.org/get-started/locally/ and pick the appropriate command.

CPU-only example (simple and safe):

pip install --index-url https://download.pytorch.org/whl/cpu torch torchvision --extra-index-url https://pypi.org/simple

GPU example (CUDA 11.8) — replace cu118 with the version for your GPU/driver:

pip install --index-url https://download.pytorch.org/whl/cu118 torch torchvision --extra-index-url https://pypi.org/simple

After installing, test GPU detection in Python:

python -c "import torch; print('torch', torch.__version__); print('cuda available:', torch.cuda.is_available())"

Using ONNX for local inference

ONNX (https://github.com/onnx) and ONNX Runtime provide a portable, often faster way to run ML models locally on Windows (CPU and GPU). The steps below cover installing ONNX Runtime, exporting a PyTorch model to ONNX, and running inference with onnxruntime from PowerShell/Python.

  1. Install ONNX Runtime and utilities (CPU example):
pip install onnx onnxruntime onnxruntime-tools

GPU example (install the GPU-enabled ONNX Runtime that matches your CUDA version) — consult ONNX Runtime docs at https://onnxruntime.ai:

# Example (replace with the correct package / version for your CUDA toolchain):
pip install onnx onnxruntime-gpu
  2. Export a PyTorch model to ONNX (simple example). Make sure a models/ directory exists first:
mkdir models -ErrorAction Ignore
# PowerShell has no bash-style heredoc; pipe a here-string to "python -" instead
@'
import torch

# Load a pretrained ResNet-18 from the torchvision hub and export it to ONNX
model = torch.hub.load('pytorch/vision:v0.14.0', 'resnet18', pretrained=True).eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, 'models/resnet18.onnx', opset_version=13,
                  input_names=['input'], output_names=['output'])
print('Exported models/resnet18.onnx')
'@ | python -
  3. Run inference with ONNX Runtime (Python example):
@'
import numpy as np
import onnxruntime as ort

# Random input just to verify the exported model loads and runs
img = np.random.rand(1, 3, 224, 224).astype(np.float32)
sess = ort.InferenceSession('models/resnet18.onnx', providers=['CPUExecutionProvider'])
out = sess.run(None, {'input': img})
print('Output shapes:', [o.shape for o in out])
'@ | python -

Notes and tips:

  • Match the preprocessing (normalization, resizing) that was used when the model was trained; a minimal preprocessing sketch follows these notes.
  • Use onnxruntime-tools or onnxruntime.transformers to optimize, quantize, or benchmark models for better performance.
  • For transformer / Hugging Face models, check transformers or optimum exporters which can produce ONNX models ready for onnxruntime.
  • Keep exported ONNX files in models/ and avoid committing large model files to Git; add them to .gitignore (already present).
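
As a concrete illustration of the first note, the sketch below shows ImageNet-style preprocessing for the resnet18.onnx export above (224x224 input, standard ImageNet mean/std). The preprocess helper and the use of Pillow are illustrative choices; a production pipeline should mirror the exact transforms used at training time (for example resize to 256 then center-crop to 224).

# Sketch: ImageNet-style preprocessing for the resnet18.onnx example (requires pillow and numpy)
import numpy as np
from PIL import Image

def preprocess(path):
    img = Image.open(path).convert('RGB').resize((224, 224))
    x = np.asarray(img, dtype=np.float32) / 255.0            # HWC, values in 0..1
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    x = (x - mean) / std                                      # per-channel normalization
    return x.transpose(2, 0, 1)[np.newaxis, ...]              # NCHW batch of 1

# Usage with the session created earlier:
# out = sess.run(None, {'input': preprocess('example.jpg')})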

Phi models & ONNX Runtime Generate API

ONNX Runtime ships separate GenAI tooling (the onnxruntime-genai package and its generate() API) for running generative models locally. The official Phi-3 Python tutorial walks through the workflow: https://onnxruntime.ai/docs/genai/tutorials/phi3-python.html

Quick guidance for Windows users:

  • Review the official tutorial (link above) for the exact packages, supported model names, and up-to-date installation steps. The GenAI features sometimes require separate runtime packages or wheels that match your CUDA / GPU drivers.
  • In general you'll need:
    • onnx and onnxruntime (or onnxruntime-gpu for GPU),
    • the ONNX Runtime GenAI package (onnxruntime-genai, or a GPU/DirectML variant; see the tutorial for the exact name/version), and
    • an exported ONNX model (the tutorial shows how to obtain or export Phi-family ONNX models).

Example (high-level, pseudo-code) — follow the tutorial for exact API and package names:

# Install (example; check the tutorial for the exact package names/versions)
pip install numpy onnxruntime-genai
# GPU variants also exist (for example onnxruntime-genai-cuda for CUDA, onnxruntime-genai-directml for DirectML)

# ILLUSTRATIVE SKETCH: the GenAI Python API differs between releases; follow the tutorial above for the exact calls
import onnxruntime_genai as og

# Load a Phi-family ONNX model from a local folder (the tutorial shows how to download or export one)
model = og.Model("path/to/phi-3-onnx-model-dir")
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=200)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("Write a short poem about autumn"))
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))

Notes:

  • Always use the exact install and import commands shown in the ONNX Runtime GenAI tutorial linked above — the GenAI API surface and package names can change across releases.
  • Some models require credentials or specific licensing to download; follow the model provider's rules.
  • For best performance, run ONNX Runtime with the appropriate execution provider (CPU, CUDA, DirectML) and consider the quantization/optimization tools described in the ONNX Runtime docs; a short provider/quantization sketch follows these notes.
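
As a concrete illustration of the last note, the sketch below shows how execution providers are selected when creating an onnxruntime InferenceSession, plus a basic dynamic quantization pass. The model paths are placeholders, and which quantization settings actually help depends on the model and hardware.

# Sketch: choosing execution providers and applying dynamic quantization (paths are placeholders)
import onnxruntime as ort
from onnxruntime.quantization import quantize_dynamic, QuantType

print('Available providers:', ort.get_available_providers())

# Providers are tried in order; CPU is the fallback if CUDA is not available in this build
sess = ort.InferenceSession(
    'models/resnet18.onnx',
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'],
)

# Dynamic quantization stores weights as int8, shrinking the file and often speeding up CPU inference
quantize_dynamic('models/resnet18.onnx', 'models/resnet18-int8.onnx', weight_type=QuantType.QInt8)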

Notes & recommendations

  • Install heavy packages (such as GPU-enabled torch) manually so you can match them to your CUDA toolkit and drivers.
  • Keep data and model artifacts out of git — use data/ and models/ folders (already in .gitignore).
  • If your repository's main Python file uses a dash in its name (phi-ai.py), run it as a script (python "phi-ai.py") instead of importing it as a module, since a dash is not valid in a Python module name; an importlib workaround is sketched below.
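
If importing the dash-named file really is needed (for example from a test), importlib can load it by path; the phi_ai module name below is an arbitrary illustrative choice.

# Sketch: importing a dash-named file by path (the module name is arbitrary)
import importlib.util

spec = importlib.util.spec_from_file_location('phi_ai', 'phi-ai.py')
phi_ai = importlib.util.module_from_spec(spec)
spec.loader.exec_module(phi_ai)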

Running tests

After activating the venv and installing dev requirements:

pip install -r requirements-dev.txt
pytest -q

CI

A GitHub Actions workflow is provided to lint with flake8 and run pytest on push/PR to main.

License

This project is licensed under the MIT License. See LICENSE.

About

All things Phi and running Phi-family models locally on Windows
