Microsoft Introduces OpenAI’s Smallest Open Model (gpt‑oss‑120b and gpt‑oss‑20b) for Windows Users.


Introduction to the New Open‑Weight Models

OpenAI has placed its latest open‑weight models, gpt‑oss‑120b and gpt‑oss‑20b, onto Microsoft’s AI platforms. Developers can now pull these models into Azure AI Foundry or run them directly on Windows devices with Foundry Local.
The move gives developers full control over the model files, so they can modify, fine‑tune, and deploy the models without being locked into a cloud‑only service.
Both models are built for real‑world workloads, delivering strong reasoning and tool use while staying efficient enough for single‑GPU or on‑device inference.
Read more about the launch on the official blog.

Why Open‑Weight Matters

  • Full visibility into each parameter.
  • Ability to apply LoRA, QLoRA, or other parameter‑efficient methods.
  • Easier to compress, quantize, or prune for memory‑constrained environments.
  • Supports exporting to ONNX or Triton for containerized serving.

These benefits translate into faster iteration cycles. Teams that have tried the models report checkpoint updates in hours instead of weeks.
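
To make the parameter‑efficient point concrete, here is a rough sketch of the arithmetic behind LoRA: instead of updating a full weight matrix, it trains two small low‑rank factors. The layer size and rank below are illustrative assumptions, not the actual gpt‑oss dimensions.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in the two LoRA factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Weights in the dense matrix that LoRA keeps frozen."""
    return d_in * d_out


# Hypothetical layer size and rank, chosen only for illustration.
d_in, d_out, rank = 4096, 4096, 8
lora = lora_trainable_params(d_in, d_out, rank)
dense = full_params(d_in, d_out)
print(f"LoRA trains {lora:,} of {dense:,} weights ({100 * lora / dense:.2f}%)")
```

Under these assumed numbers the adapter is well under one percent of the layer, which is why adapter training fits on far smaller hardware than full fine‑tuning.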

Azure AI Foundry Overview

Azure AI Foundry acts as a one‑stop shop for building, tuning, and serving AI agents.

Core Features

  1. Model catalog – Over 11 000 entries, now including gpt‑oss‑120b and gpt‑oss‑20b.
  2. Training pipelines – Managed compute for fine‑tuning with LoRA, QLoRA, and PEFT.
  3. Secure serving – Low‑latency endpoints protected by Azure’s compliance stack.

Getting Started in Azure

  • Open Azure Cloud Shell.
  • Run az foundry model create --name gpt-oss-120b --sku Standard_ND96asr_v4.
  • Deploy the endpoint with a single CLI call.

The process finishes in minutes, depending on network speed.

Windows AI Foundry and Foundry Local

Windows AI Foundry extends the same capabilities to personal computers.
Foundry Local runs as a lightweight service that downloads the best‑fit binary for the hardware present.

System Requirements

| Component | Minimum | Recommended |
| --- | --- | --- |
| OS | Windows 10 (x64), Windows 11 (x64/ARM) | macOS, Windows Server 2025 |
| RAM | 8 GB | 16 GB |
| Disk | 3 GB free | 15 GB free |
| GPU (optional) | Any recent NVIDIA/AMD/Intel | NVIDIA 2000‑series+, AMD 6000‑series+, Intel iGPU, Qualcomm Snapdragon X Elite, Apple silicon |
| Admin rights | Yes | Yes |

Installation Steps

  • Windows – Open PowerShell and execute winget install Microsoft.FoundryLocal.
  • macOS – Run brew tap microsoft/foundrylocal then brew install foundrylocal.

For those who prefer a manual download, the installer is on the project’s GitHub page.

Running Your First Model

After the service starts, open a terminal and type:

foundry model run phi-3.5-mini

The model will download, then you can ask simple questions:

Why do leaves fall?

The reply appears directly in the console.

To swap in another model, replace phi-3.5-mini with any catalog entry, for example gpt-oss-20b.
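
If you want to script the swap rather than retype the command, a small wrapper like the one below may help. It is a minimal sketch: the helper names build_run_command and run_model are made up for this example, and it assumes the foundry executable is on your PATH.

```python
import shutil
import subprocess
from typing import List


def build_run_command(model_alias: str) -> List[str]:
    """Return the argument list for `foundry model run <alias>`."""
    if not model_alias or any(ch.isspace() for ch in model_alias):
        raise ValueError("model alias must be a non-empty token without spaces")
    return ["foundry", "model", "run", model_alias]


def run_model(model_alias: str) -> int:
    """Start the interactive session and return the CLI's exit code."""
    if shutil.which("foundry") is None:
        raise FileNotFoundError("foundry CLI not found on PATH")
    return subprocess.run(build_run_command(model_alias)).returncode


# Only builds and prints the command; call run_model() to actually launch it.
print(" ".join(build_run_command("gpt-oss-20b")))
```

Passing the arguments as a list avoids shell quoting problems, and the alias check rejects obviously malformed input before anything is executed.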

Using gpt‑oss‑20b on a Local Machine

Running gpt‑oss‑20b requires a GPU with at least 16 GB VRAM and Foundry Local version 0.6.87 or newer.

foundry model run gpt-oss-20b

Check the installed version with foundry --version.

If the command fails, verify the GPU driver and ensure the CUDA toolkit is present.
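
The version requirement above is easy to check in a script. The sketch below compares a version string against 0.6.87; it assumes the output of foundry --version contains a dotted number such as 0.6.87, which may not match the real output format exactly.

```python
import re
from typing import Tuple

MIN_VERSION = (0, 6, 87)  # minimum Foundry Local version named in this article


def parse_version(text: str) -> Tuple[int, ...]:
    """Pull the first dotted number (e.g. '0.6.87') out of a version string."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(text: str, minimum: Tuple[int, ...] = MIN_VERSION) -> bool:
    """True when the parsed version is at least the required minimum."""
    return parse_version(text) >= minimum


# Sample strings for illustration; feed in the real CLI output instead.
for sample in ("0.6.87", "0.5.100", "0.7.0"):
    print(sample, "ok" if meets_minimum(sample) else "too old")
```

Comparing tuples of integers rather than raw strings matters here, because a plain string comparison would rank 0.6.9 above 0.6.87.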

Fine‑Tuning and Optimization

Open‑weight models invite developers to adapt them for niche domains.

  • Parameter‑efficient tuning – Apply LoRA adapters in minutes.
  • Quantization – Reduce precision to 4‑bit for faster inference on edge devices.
  • Distillation – Create a smaller student model that mimics the large teacher.
  • Structured sparsity – Trim unused weights to meet strict memory limits.

All these steps are supported through Azure AI Foundry pipelines or locally via the Foundry CLI (foundry model fine-tune, foundry model quantize).
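
To see why 4‑bit quantization matters on edge devices, it helps to estimate how much memory the weights alone occupy at different precisions. This is a back‑of‑the‑envelope sketch: the parameter counts are nominal values read off the model names, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate storage for the weights only, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Nominal sizes taken from the model names; exact parameter counts differ.
models = {"gpt-oss-20b": 20e9, "gpt-oss-120b": 120e9}

for name, params in models.items():
    for bits in (16, 8, 4):
        print(f"{name} at {bits}-bit: ~{weight_memory_gb(params, bits):.0f} GB")
```

Each halving of precision halves the weight footprint, so under these assumptions a nominal 20 B model drops from about 40 GB at 16‑bit to about 10 GB at 4‑bit.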

Managing the Local Cache

Downloaded models sit in a cache folder on the device.

foundry cache list
foundry cache clean --max-size 10GB

Regular cleaning prevents disk bloat, especially when testing multiple variants.

Upgrading and Uninstalling

Staying current ensures compatibility with the newest model releases.

  • Upgrade on Windows – winget upgrade --id Microsoft.FoundryLocal.
  • Upgrade on macOS – brew upgrade foundrylocal.

To remove the service:

  • Windows – winget uninstall Microsoft.FoundryLocal.
  • macOS – brew rm foundrylocal && brew untap microsoft/foundrylocal && brew cleanup --scrub.

Real‑World Use Cases

Enterprise Knowledge Assistant

A large retailer integrated gpt‑oss‑120b into their internal search platform.
The model answered policy questions in under two seconds, reducing support tickets by 30 %.

Edge Device Automation

A robotics startup deployed gpt‑oss‑20b on Windows laptops mounted on autonomous drones.
The model executed planning commands without contacting the cloud, preserving bandwidth.

Academic Research

A university lab used the open weights to explore bias mitigation.
They rewrote attention layers and published a paper on transparent LLM adjustments.

Security and Governance

Both Azure AI Foundry and Foundry Local ship with built‑in content safety modules.

  • Input filtering blocks disallowed requests.
  • Audit logs record every inference call for compliance tracking.

Open models also allow manual inspection of attention maps, supporting independent security reviews.
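
The filter‑then‑log pattern described above can be illustrated in a few lines. This is a toy sketch and not how Azure AI Foundry or Foundry Local implement content safety: the blocklist, the audited_call helper, and the log fields are all hypothetical.

```python
import json
import time
from typing import Callable, List, Optional

BLOCKED_TERMS = {"example-banned-term"}  # hypothetical blocklist for illustration


def audited_call(
    prompt: str, model_fn: Callable[[str], str], log: List[str]
) -> Optional[str]:
    """Filter the prompt, call the model if allowed, and append one JSON log line."""
    blocked = any(term in prompt.lower() for term in BLOCKED_TERMS)
    reply = None if blocked else model_fn(prompt)
    log.append(json.dumps({
        "timestamp": time.time(),
        "prompt_chars": len(prompt),  # log the size, not the raw prompt text
        "blocked": blocked,
    }))
    return reply


# A stand-in "model" that just echoes the prompt in upper case.
audit_log: List[str] = []
print(audited_call("hello", str.upper, audit_log))
print(audited_call("contains example-banned-term", str.upper, audit_log))
print(len(audit_log), "calls logged")
```

Every call produces a log entry whether or not it was blocked, which is the property a compliance review typically cares about.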

Community and Support

The Foundry ecosystem includes a vibrant developer forum, GitHub issues, and a dedicated tech‑community page.
New contributors can submit pull requests for model wrappers or sample pipelines.

Pricing Snapshot

  • Azure AI Foundry usage follows the standard Managed Compute rates; see the Azure pricing page for details.
  • Foundry Local itself is free; only hardware and electricity costs apply.

All pricing reflects August 2025 rates.

Tips for a Smooth Experience

  1. Verify GPU drivers are up‑to‑date.
  2. Keep the CLI version aligned with the model catalog.
  3. Use the foundry model list command to see current compatibility.
  4. Reserve at least 15 GB of disk space for caching.
  5. Test with a small model (phi‑3.5‑mini) before pulling the 20 B variant.
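
The disk space tip can be checked before a large download. The sketch below uses only the Python standard library; the path argument is a placeholder, since the actual cache location depends on your installation.

```python
import shutil


def free_space_gb(path: str = ".") -> float:
    """Free space on the drive holding `path`, in gigabytes (10^9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_room_for_cache(path: str = ".", required_gb: float = 15.0) -> bool:
    """True when the drive has at least `required_gb` free."""
    return free_space_gb(path) >= required_gb


# "." is a placeholder; point this at the drive that holds the model cache.
print(f"{free_space_gb('.'):.1f} GB free, enough for cache: {has_room_for_cache('.')}")
```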

Future Directions

Open‑weight releases are expected to continue, expanding the catalog beyond language models to vision and multimodal systems.
Microsoft plans tighter integration with Windows AI Foundry, allowing seamless switching between cloud and edge at runtime.

Quick Reference Cheat Sheet

| Action | Command (Windows) | Command (macOS) |
| --- | --- | --- |
| Install Foundry Local | winget install Microsoft.FoundryLocal | brew tap microsoft/foundrylocal && brew install foundrylocal |
| Run phi‑3.5‑mini | foundry model run phi-3.5-mini | Same |
| Run gpt‑oss‑20b | foundry model run gpt-oss-20b | Same |
| List models | foundry model list | Same |
| Check version | foundry --version | Same |
| Upgrade | winget upgrade --id Microsoft.FoundryLocal | brew upgrade foundrylocal |
| Uninstall | winget uninstall Microsoft.FoundryLocal | brew rm foundrylocal && brew untap microsoft/foundrylocal |

Conclusion

Open‑weight models on Azure and Windows give developers the freedom to experiment, customize, and ship AI solutions without compromising on performance or security.
The combination of Azure AI Foundry’s managed services and Foundry Local’s on‑device runtime creates a flexible stack that covers both cloud and on‑device workflows.
If you have a GPU‑enabled Windows machine or access to Azure compute, you can start today by installing Foundry Local and pulling the gpt‑oss models.

Additional Resources

  • Official blog post announcing the models
  • Foundry Local getting‑started guide
  • Azure AI Foundry model catalog page
  • GitHub repository for Foundry Local (search “Microsoft/FoundryLocal”)

Explore, tweak, and deploy – the new era of open AI is already here.
