I have been wanting to run local LLMs, but I only have an Intel Iris Xe GPU, and most sources out there talk about using the newest and the best like an RTX 4090 or something :/
It turns out it's not that hard, and here is my little journey:
Setup
I will be sharing my experience using the environment below. If it's not complete, please forgive me and add your info in the comments please :3
- Windows 11 and PowerShell 7
- Python 3.11 with uv
- Laptop with 16GB RAM
Install ipex-llm[cpp] and Ollama
This is the specific build of ipex-llm that includes the Intel-optimized C++ binaries. It allows Ollama's core to communicate with the Iris Xe GPU via the SYCL backend. Make sure to add [cpp], because otherwise it's just the standard library.
mkdir intel-ollama
cd intel-ollama
uv venv --python 3.11
. .\.venv\Scripts\Activate.ps1
uv pip install --pre --upgrade ipex-llm[cpp]
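If you want to double-check that the package actually landed inside the venv (just a sanity check, not required):
# Show the installed ipex-llm version in the active venv
uv pip show ipex-llm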
Then run the init script
.\.venv\Scripts\init-ollama.bat
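The script should link the Ollama binaries (ollama.exe etc.) into the current folder; a quick way to check what it created:
# List the files init-ollama.bat linked into this folder
Get-ChildItem -Path . -Filter "ollama*"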
DLL Injection
When I tried to run Ollama, I got the error svml_dispmd.dll not found, plus a bunch of other "DLL not found" errors.
The Intel runtimes should already be installed, bundled with ipex-llm[cpp], so we just need to add their directories to our PATH. Here is the short script I used to inject those DLLs:
# Collect every directory inside the venv that contains a DLL
$IntelPaths = Get-ChildItem -Path ".\.venv" -Filter "*.dll" -Recurse |
    Select-Object -ExpandProperty DirectoryName -Unique
# Prepend those directories to PATH so the runtimes can be found
$env:Path = ($IntelPaths -join ";") + ";" + $env:Path
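To confirm the injection worked, you can ask Windows to resolve the DLL that was missing earlier (where.exe searches every folder on PATH):
# Should print the DLL's full path inside .venv instead of erroring
where.exe svml_dispmd.dll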
Graphics Driver
After that, Ollama launched, but when I tried to run any model it didn't work. I got the error unsupported SPIR-V version number 'unknown (66560)'. This is because we need a driver that supports at least SPIR-V 1.4, but my graphics driver only supported SPIR-V 1.3.
To fix it, I updated to the latest Intel Arc & Iris Xe Graphics Driver (31.0.101.xxxx or newer) using the Official Updater.
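If you are not sure which driver version you currently have, you can read it straight from PowerShell before and after updating:
# Show the name and driver version of every GPU Windows knows about
Get-CimInstance Win32_VideoController | Select-Object Name, DriverVersion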
Run Ollama
To run Ollama itself, we need to prepare some environment variables. I used the script below to do it, and I also added the .venv activation and the DLL injection so it would be easier to run.
set_env.ps1
# Activate the venv if it is not active yet
if ($null -eq $env:VIRTUAL_ENV) { . .\.venv\Scripts\Activate.ps1 }
# Inject the Intel DLL directories into PATH (same trick as above)
$env:Path = ((Get-ChildItem -Path ".\.venv" -Filter "*.dll" -Recurse |
    Select-Object -ExpandProperty DirectoryName -Unique) -join ";") + ";" + $env:Path
# Point SYCL/oneAPI at the GPU instead of the CPU
$env:SYCL_DEVICE_FILTER = "level_zero:gpu"
$env:ONEAPI_DEVICE_SELECTOR = "level_zero:0"
$env:ZES_ENABLE_SYSMAN = "1"
# Tell Ollama to use the Intel GPU and offload every layer it can
$env:OLLAMA_INTEL_GPU = "true"
$env:OLLAMA_NUM_GPU = "999"
$env:OLLAMA_CONTEXT_LENGTH = "8192" # for IPEX-LLM
$env:OLLAMA_NUM_CTX = "8192" # for Ollama Server
start_ollama.ps1
# Load the env vars into this session (note the leading dot)
. .\set_env.ps1
# Kill any Ollama instance that is already running
Stop-Process -Name ollama -Force -ErrorAction SilentlyContinue
# Start the server (ollama.exe was linked here by init-ollama.bat)
.\ollama.exe serve
Remember to add the dot at the beginning (. .\set_env.ps1) so the environment variables stay in your current terminal.
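You can quickly confirm the variables actually stuck in your session:
# List the Ollama-related variables set by set_env.ps1
Get-ChildItem env:OLLAMA*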
After the Ollama server is running via .\start_ollama.ps1, you can just open a new terminal, run . .\set_env.ps1, and then use Ollama as usual:
.\ollama.exe run phi3:mini
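You can also talk to the server over its HTTP API, which is handy for scripting (this assumes the default port 11434 and the model pulled above):
# Ask the running server for a completion via the REST API
$body = @{ model = "phi3:mini"; prompt = "Why is the sky blue?"; stream = $false } | ConvertTo-Json
(Invoke-RestMethod -Uri "http://localhost:11434/api/generate" -Method Post -Body $body -ContentType "application/json").response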
Results
With this setup, I tried Phi-3 Mini (3.8B) and it runs with all 33/33 layers offloaded to the GPU. Finally, I can run LLMs locally >:)
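If you want actual numbers instead of just vibes, ollama run has a --verbose flag that prints timing stats after each answer (I am not 100% sure every stat shows up on the older core that ipex-llm bundles):
# Prints prompt eval rate and eval rate (tokens/s) after the reply
.\ollama.exe run phi3:mini --verbose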

