# Ollama

[Ollama](https://github.com/ollama/ollama) enables you to easily run large language models (LLMs) locally. It supports Llama 3, Mistral, Gemma and [many others](https://ollama.com/library).

<center>
<blockquote class="twitter-tweet" data-media-max-width="560"><p lang="en" dir="ltr">You can now perform LLM inference with Ollama in services-flake!<a href="https://t.co/rtHIYdnPfb">https://t.co/rtHIYdnPfb</a> <a href="https://t.co/1hBqMyViEm">pic.twitter.com/1hBqMyViEm</a></p>&mdash; NixOS Asia (@nixos_asia) <a href="https://twitter.com/nixos_asia/status/1800855562072322052?ref_src=twsrc%5Etfw">June 12, 2024</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
</center>

## Getting Started

```nix
# In `perSystem.process-compose.<name>`
{
  services.ollama."ollama1".enable = true;
}
```
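
All of the snippets on this page assume the `perSystem.process-compose.<name>` context provided by [process-compose-flake](https://github.com/Platonic-Systems/process-compose-flake). As a rough sketch of where that context comes from, a minimal flake could look like the following; the input names and the `"ollama"` attribute name are illustrative choices, not requirements.

```nix
# flake.nix -- a minimal sketch; input and attribute names here are illustrative
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
    flake-parts.url = "github:hercules-ci/flake-parts";
    process-compose-flake.url = "github:Platonic-Systems/process-compose-flake";
    services-flake.url = "github:juspay/services-flake";
  };
  outputs = inputs:
    inputs.flake-parts.lib.mkFlake { inherit inputs; } {
      systems = [ "x86_64-linux" "aarch64-darwin" ];
      imports = [ inputs.process-compose-flake.flakeModule ];
      perSystem = { ... }: {
        # This is the `perSystem.process-compose.<name>` block referenced above.
        process-compose."ollama" = {
          imports = [ inputs.services-flake.processComposeModules.default ];
          services.ollama."ollama1".enable = true;
        };
      };
    };
}
```

With a flake along these lines, `nix run .#ollama` should start the process-compose TUI with the `ollama1` service running.
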
## Acceleration

By default Ollama uses the CPU for inference. To enable GPU acceleration:

### CUDA

For NVIDIA GPUs.

```nix
# In `perSystem.process-compose.<name>`
{
  services.ollama."ollama1" = {
    enable = true;
    acceleration = "cuda";
  };
}
```
### ROCm

For Radeon GPUs.

```nix
# In `perSystem.process-compose.<name>`
{
  services.ollama."ollama1" = {
    enable = true;
    acceleration = "rocm";
  };
}
```