command: docker model package
short: Package a model into a Docker Model OCI artifact
long: |-
    Package a model into a Docker Model OCI artifact.

    The model source must be one of:

      --gguf             A GGUF file (single file or first shard of a sharded model)
      --safetensors-dir  A directory containing .safetensors and configuration files
      --dduf             A .dduf (Diffusers Unified Format) archive
      --from             An existing packaged model reference

    By default, the packaged artifact is loaded into the local Model Runner content store.
    Use --push to publish the model to a registry instead.

    MODEL specifies the target model reference (for example: myorg/llama3:8b).
    When using --push, MODEL must be a registry-qualified reference.

    Packaging behavior:

      GGUF
        --gguf must point to a .gguf file.
        For sharded models, point to the first shard. All shards must:
          • reside in the same directory
          • follow an indexed naming convention (e.g. model-00001-of-00015.gguf)
        All shards are automatically discovered and packaged together.

      Safetensors
        --safetensors-dir must point to a directory containing .safetensors files
        and required configuration files (e.g. model config, tokenizer files).
        All files under the directory (including nested subdirectories) are
        automatically discovered. Each file is packaged as a separate OCI layer.

      DDUF
        --dduf must point to a .dduf archive file.

      Repackaging
        --from repackages an existing model. You may override selected properties
        such as --context-size to create a variant of the original model.

      Multimodal models
        Use --mmproj to include a multimodal projector file.
usage: docker model package (--gguf <path> | --safetensors-dir <path> | --dduf <path> | --from <model>) [--license <path>...] [--mmproj <path>] [--context-size <tokens>] [--push] MODEL
pname: docker model
plink: docker_model.yaml
options:
    - option: chat-template
      value_type: string
      description: absolute path to chat template file (must be Jinja format)
      deprecated: false
      hidden: false
      experimental: false
      experimentalcli: false
      kubernetes: false
      swarm: false
    - option: context-size
      value_type: uint64
      default_value: "0"
      description: context size in tokens
      deprecated: false
      hidden: false
      experimental: false
      experimentalcli: false
      kubernetes: false
      swarm: false
    - option: dduf
      value_type: string
      description: absolute path to DDUF archive file (Diffusers Unified Format)
      deprecated: false
      hidden: false
      experimental: false
      experimentalcli: false
      kubernetes: false
      swarm: false
    - option: format
      value_type: string
      default_value: docker
      description: |
        output artifact format: "docker" (default) or "cncf" (CNCF ModelPack spec)
      deprecated: false
      hidden: false
      experimental: false
      experimentalcli: false
      kubernetes: false
      swarm: false
    - option: from
      value_type: string
      description: reference to an existing model to repackage
      deprecated: false
      hidden: false
      experimental: false
      experimentalcli: false
      kubernetes: false
      swarm: false
    - option: gguf
      value_type: string
      description: absolute path to gguf file
      deprecated: false
      hidden: false
      experimental: false
      experimentalcli: false
      kubernetes: false
      swarm: false
    - option: license
      shorthand: l
      value_type: stringArray
      default_value: '[]'
      description: absolute path to a license file
      deprecated: false
      hidden: false
      experimental: false
      experimentalcli: false
      kubernetes: false
      swarm: false
    - option: mmproj
      value_type: string
      description: absolute path to multimodal projector file
      deprecated: false
      hidden: false
      experimental: false
      experimentalcli: false
      kubernetes: false
      swarm: false
    - option: push
      value_type: bool
      default_value: "false"
      description: |
        push to registry (if not set, the model is loaded into the Model Runner content store)
      deprecated: false
      hidden: false
      experimental: false
      experimentalcli: false
      kubernetes: false
      swarm: false
    - option: safetensors-dir
      value_type: string
      description: absolute path to directory containing safetensors files and config
      deprecated: false
      hidden: false
      experimental: false
      experimentalcli: false
      kubernetes: false
      swarm: false
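examples: |-
    The sharded GGUF rules (same directory, indexed names such as
    model-00001-of-00015.gguf) can be checked before packaging. The sketch
    below is a hypothetical pre-flight helper, not part of the CLI: the
    function name expected_shards and the exact pattern it matches are
    assumptions, and the discovery logic inside docker model package may
    differ. It prints every shard it expects next to the first shard and
    fails if one is missing.

    ```shell
    # Sketch: list the shards expected next to the first shard of a GGUF model.
    # Assumes names of the form <prefix>-<index>-of-<total>.gguf (hypothetical
    # helper; docker model package may match shards differently).
    expected_shards() (
        # The body runs in a subshell, so these variables do not leak.
        first=$1
        dir=$(dirname "$first")
        base=$(basename "$first")
        pattern='^\(.*\)-\([0-9][0-9]*\)-of-\([0-9][0-9]*\)\.gguf$'
        prefix=$(printf '%s\n' "$base" | sed -n "s/$pattern/\1/p")
        index=$(printf '%s\n' "$base" | sed -n "s/$pattern/\2/p")
        total=$(printf '%s\n' "$base" | sed -n "s/$pattern/\3/p")

        # Strip leading zeros so the numbers are never read as octal.
        index_n=$(printf '%s\n' "$index" | sed 's/^0*//')
        total_n=$(printf '%s\n' "$total" | sed 's/^0*//')
        if [ -z "$index" ] || [ "${index_n:-0}" -ne 1 ]; then
            echo "not the first shard of a sharded model: $base" >&2
            exit 1
        fi

        # Rebuild each shard name with the same zero-padded width as the index.
        width=${#index}
        i=1
        while [ "$i" -le "${total_n:-0}" ]; do
            path="$dir/$prefix-$(printf "%0${width}d" "$i")-of-$total.gguf"
            if [ ! -f "$path" ]; then
                echo "missing shard: $path" >&2
                exit 1
            fi
            printf '%s\n' "$path"
            i=$((i + 1))
        done
    )

    # Example with a hypothetical path and model name:
    #   expected_shards /models/model-00001-of-00015.gguf \
    #     && docker model package --gguf /models/model-00001-of-00015.gguf myorg/llama3:8b
    ```

    If the check passes, pass the first shard to --gguf as an absolute path.
    Add --push and a registry-qualified MODEL to publish the result instead
    of loading it into the local Model Runner content store.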
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false