Upload your custom model to 🤗 Hub
Your fine-tuned or LoRA-adapted model is ready, so how do you share it with teammates, the community, or your own apps? Easy: upload it to the Hugging Face Hub, the world's largest open library of models.
✅ Why Upload to the Hub?
✔️ Free hosting for models, datasets, and demo Spaces.
✔️ Easy to version, update, and share with a link.
✔️ Load your model from anywhere with from_pretrained().
✔️ Collaborate — add teammates or make it public.
✔️ Optional: Deploy instantly with Inference API or Spaces.
✅ Step 1️⃣ — Log In to Your Account
Make sure you’re logged in from your terminal:
```bash
huggingface-cli login
```

Paste your token when asked. (You can create a new token under Settings ➜ Access Tokens.)
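If you prefer to log in from Python (for example, inside a notebook), the `huggingface_hub` client offers a `login()` helper. A minimal sketch:

```python
from huggingface_hub import login

# Prompts for your access token interactively; you can also pass token="hf_..." directly.
login()
```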
✅ Step 2️⃣ — Choose a Model Name
Decide what to call it:
Be descriptive:
```
my-python-helper-llm
```

Use lowercase and dashes for readability.
✅ Step 3️⃣ — Use transformers to Push
If you used Trainer or PEFT:
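Here is a minimal sketch. It assumes `trainer`, `model`, and `tokenizer` already exist from your training run, and the repo names are placeholders:

```python
# Option A: if you trained with transformers' Trainer and set
# TrainingArguments(push_to_hub=True, hub_model_id="YOUR_USERNAME/my-python-helper-llm"),
# one call uploads the final weights plus a basic model card.
trainer.push_to_hub()

# Option B: push an already-loaded model and tokenizer explicitly.
model.push_to_hub("YOUR_USERNAME/my-python-helper-llm")
tokenizer.push_to_hub("YOUR_USERNAME/my-python-helper-llm")
```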
✅ Done! Your model now lives at:
https://huggingface.co/YOUR_USERNAME/YOUR_MODEL_NAME
✅ If You Used LoRA/PEFT
If you fine-tuned with LoRA, you’ll usually push just the adapter:
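A minimal sketch, assuming `model` is a PEFT-wrapped model (e.g. returned by `get_peft_model`) and the repo id is a placeholder. Only the small adapter files are uploaded, not the base model weights:

```python
# Uploads adapter_config.json and the adapter weights only.
model.push_to_hub("YOUR_USERNAME/my-python-helper-lora")
tokenizer.push_to_hub("YOUR_USERNAME/my-python-helper-lora")
```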
When someone wants to use it:
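They load the original base model and attach your adapter from the Hub. Names below are placeholders; the base model must match the one the adapter was trained on:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("BASE_MODEL_NAME")
model = PeftModel.from_pretrained(base, "YOUR_USERNAME/my-python-helper-lora")
tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/my-python-helper-lora")
```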
✅ Step 4️⃣ — Add a README.md
A good model card includes:
What the model does (and doesn’t do)
How it was trained (data, steps, license)
Example usage
Limitations and disclaimers
The Hub will auto-create a basic README. Edit it in the web UI!
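You can also write the card from Python with the `huggingface_hub` ModelCard class. A hedged sketch (the card text and repo id are placeholders):

```python
from huggingface_hub import ModelCard

content = """---
license: apache-2.0
tags:
  - lora
  - instruction-tuned
---

# my-python-helper-llm

What the model does, how it was trained, example usage, and known limitations go here.
"""

ModelCard(content).push_to_hub("YOUR_USERNAME/my-python-helper-llm")
```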
✅ Step 5️⃣ — Make It Public (or Private)
By default, models are public. You can make them private for team use:
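A sketch using the `huggingface_hub` client (the repo id is a placeholder; recent versions of `huggingface_hub` expose `update_repo_settings` as the newer equivalent):

```python
from huggingface_hub import HfApi

api = HfApi()
# Flip an existing repo to private.
api.update_repo_visibility("YOUR_USERNAME/my-python-helper-llm", private=True)

# Or create it as private from the first push:
# model.push_to_hub("YOUR_USERNAME/my-python-helper-llm", private=True)
```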
Or toggle visibility in the web interface.
✅ Step 6️⃣ — Test It!
Try loading your model from scratch to verify it works:
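For example (placeholder repo id; a causal LM is assumed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("YOUR_USERNAME/my-python-helper-llm")
tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/my-python-helper-llm")

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```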
✅ Good Practices
✔️ Add tags like RAG, LoRA, instruction-tuned to help others find your model.
✔️ Include an example inference.py or a Gradio demo link (see the upload sketch after this list).
✔️ Pin important files (config.json, tokenizer.json).
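To attach an extra file such as an example `inference.py`, one option is the `huggingface_hub` upload API. A sketch with placeholder names:

```python
from huggingface_hub import HfApi

HfApi().upload_file(
    path_or_fileobj="inference.py",   # local file
    path_in_repo="inference.py",      # where it appears in the repo
    repo_id="YOUR_USERNAME/my-python-helper-llm",
)
```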
🗝️ Key Takeaway
Uploading to the 🤗 Hub turns your local model into a cloud-ready, plug-and-play asset — shareable, versioned, and reusable in any script or app.
➡️ Next: Learn how to serve your model with an Inference API or deploy it with accelerate or FastAPI for live production!