
When you want to reuse a model without re-specifying its persona, temperature, and other attributes every time you start it, you can use the .modelfile Customization Approach.
Step 1: Create a .modelfile as shown below (sys_admin.modelfile)
# 1. THE BASE (Required)
FROM llama3
# 2. BRAIN PHYSICS (Parameters)
# Comments are kept on their own lines; a trailing # comment may be read as part of the value.
# Creativity (0.0 to 1.0+)
PARAMETER temperature 0.7
# How many "tokens" of memory it has
PARAMETER num_ctx 4096
# Limits the "vocabulary" pool for each word
PARAMETER top_k 40
# Probability threshold for word choice
PARAMETER top_p 0.9
# Discourages the AI from getting stuck in a loop
PARAMETER repeat_penalty 1.1
# Tells the AI exactly when to stop talking
PARAMETER stop "User:"
PARAMETER stop "—"
# 3. THE TEMPLATE (The "Skeleton" of a conversation)
# This defines how the model sees the turn-taking between User and AI.
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>"""
# 4. (System Instructions)
SYSTEM """
You are a specialized Azure Networking Assistant and System Administrator with plenty of experience.
You provide CLI commands for Linux Mint and PowerShell for Windows.
Constraints:
1. If a config is insecure, call it out immediately.
"""
# 5. PRE-LOADING (The "Conversation Starter")
# You can bake in a "fake" memory so the model thinks it's already talking to you.
# [OPTIONAL] ADAPTER ~/models/my-adapter # (for actual fine-tuned weights)
MESSAGE user "Check the S2S status."
MESSAGE assistant "Checking the IPsec tunnels now. One moment."
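If you copy a Modelfile from a web page, it can pick up curly quotes. These are ordinary characters rather than string delimiters, so a stop sequence or message can end up including the quote marks themselves. Below is a minimal pre-flight check, assuming the file is saved as sys_admin.modelfile. The function name check_modelfile is hypothetical and not part of Ollama.

```shell
# Hypothetical helper: basic checks to run before `ollama create`.
check_modelfile() {
  file="$1"
  # A Modelfile needs a FROM line naming the base model.
  if ! grep -qi '^FROM[[:space:]]' "$file"; then
    echo "missing FROM line"
    return 1
  fi
  # Curly quotes usually come from copy-pasting out of a browser.
  if grep -qF -e '“' -e '”' -e '‘' -e '’' "$file"; then
    echo "curly quotes found"
    return 1
  fi
  echo "ok"
}

# Usage:
#   check_modelfile ./sys_admin.modelfile
```

This only catches two common mistakes; it does not validate parameter names or values.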
Step 2: Create an overlay on top of an existing model
Once the .modelfile is ready, create a new overlay on top of the existing model named in its FROM line, like so:
ollama create my-new-overlay-sysadmin -f ./sys_admin.modelfile
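To confirm the overlay was created, you can look for it in the output of ollama list and print its settings with ollama show. The sketch below assumes the first column of ollama list is the model name followed by a tag (for example my-new-overlay-sysadmin:latest). The helper name overlay_exists is made up for illustration.

```shell
# Hypothetical helper: succeeds if the named model appears in `ollama list`.
overlay_exists() {
  # Skip the header row, strip the ":tag" suffix, and compare names exactly.
  ollama list | awk -v name="$1" '
    NR > 1 { split($1, parts, ":"); if (parts[1] == name) found = 1 }
    END { exit !found }'
}

# Usage (requires Ollama to be installed and running):
#   overlay_exists my-new-overlay-sysadmin && \
#     ollama show my-new-overlay-sysadmin --modelfile
```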
Step 3: Create an alias for easy use
To make it “instant” so you don’t have to type long commands, add an alias to your .bashrc file. This is the bridge between your OS and the AI.
- Open your config:
nano ~/.bashrc
- Add this line at the bottom:
alias summon-admin='ollama run my-new-overlay-sysadmin'
- Save and refresh:
source ~/.bashrc
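If you would rather pass a file name directly than pipe it in, a shell function can stand in for the alias. This is a sketch, not something Ollama ships; the name summon_admin is hypothetical, and it assumes ollama run reads the prompt from stdin when input is piped, as the alias usage in this post does.

```shell
# Hypothetical function version of the alias; it would go in ~/.bashrc.
summon_admin() {
  if [ "$#" -gt 0 ]; then
    # One or more files given: send their contents to the model on stdin.
    cat -- "$@" | ollama run my-new-overlay-sysadmin
  else
    # No arguments: behave like the alias (interactive chat, or piped stdin).
    ollama run my-new-overlay-sysadmin
  fi
}

# Usage:
#   summon_admin /etc/ssh/sshd_config
```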
How it works in practice
Now, whenever you are looking at a messy config file on your machine, you just pipe the text to your new friend:
cat /etc/ssh/sshd_config | summon-admin
The model will wake up, read the file, and start grumbling about your security choices.
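Because num_ctx caps how much text the model can take in, it can help to trim a config before piping it. The sketch below drops blank lines and full-line comments; strip_comments is a hypothetical helper name. It is a trade-off: in files like sshd_config the comments document the defaults, so stripping them saves tokens but removes that context.

```shell
# Hypothetical helper: print a config file without blank lines or
# lines whose first non-space character is '#'.
strip_comments() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Usage (requires the overlay from Step 2):
#   strip_comments /etc/ssh/sshd_config | ollama run my-new-overlay-sysadmin
```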
How is this different from prompt engineering?
1. Hardware & Environment Parameters
Prompt engineering cannot change the settings the model runs with, such as sampling temperature or context size. A .modelfile can.
- Parameter Tuning: You set things like PARAMETER temperature 0.2 (for consistency) or PARAMETER num_ctx 4096 (how much “memory” it has for your config files).
- Stop Sequences: You can tell the model exactly when to stop talking (e.g., PARAMETER stop "User:"), preventing it from rambling.
2. The “Persona” vs. The “Ask”
- Prompt Engineering: You have to tell the model every time: “Act like a sys admin and check this file…”
- Modelfile (The Base-Overlay): The persona is “baked in” when you launch your “SysAdmin” model, so you only send the ask.
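The contrast can be sketched as two shell functions. Both names (ask_base and ask_overlay) are made up for illustration, and the second assumes the overlay from Step 2 exists and that ollama run reads piped stdin as the prompt.

```shell
# Prompt engineering: the persona is restated in every call to the base model.
ask_base() {
  ollama run llama3 "Act like a sys admin and check this file: $(cat "$1")"
}

# Modelfile overlay: the persona lives in the model, so only the file is sent.
ask_overlay() {
  ollama run my-new-overlay-sysadmin < "$1"
}

# Usage:
#   ask_base /etc/ssh/sshd_config
#   ask_overlay /etc/ssh/sshd_config
```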
3. Layered Inheritance (The “FROM” command)
This is the part that is impossible with just prompting.
- In a .modelfile, the first line is usually FROM llama3 (or any model that you use). This is Inheritance.
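Inheritance can also be stacked. As far as I know, FROM accepts a model you created locally as well as a downloaded one, so a second overlay can build on the first. The sketch below assumes the new model keeps the SYSTEM prompt and template from my-new-overlay-sysadmin and only overrides the temperature. The function name, file name, and the model name sysadmin-strict are all hypothetical.

```shell
# Hypothetical: write a second Modelfile that builds on the overlay from Step 2.
write_strict_modelfile() {
  cat > "$1" <<'EOF'
FROM my-new-overlay-sysadmin
PARAMETER temperature 0.2
EOF
}

# Usage (requires Ollama and the overlay from Step 2):
#   write_strict_modelfile ./sysadmin_strict.modelfile
#   ollama create sysadmin-strict -f ./sysadmin_strict.modelfile
```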