Upload Your Custom Model to the 🤗 Hub

Your fine-tuned or LoRA-adapted model is now ready — so how do you share it with teammates, the community, or your own apps? Easy: upload it to the Hugging Face Hub, the world’s largest open library of machine learning models.


Why Upload to the Hub?

✔️ Free hosting for models, datasets, and demo Spaces.
✔️ Easy to version, update, and share with a link.
✔️ Load your model from anywhere with from_pretrained().
✔️ Collaborate — add teammates or make it public.
✔️ Optional: deploy instantly with the Inference API or Spaces.


Step 1️⃣ — Log In to Your Account

Make sure you’re logged in from your terminal:

huggingface-cli login

Paste your token when asked. (You can create a new token under Settings ➜ Access Tokens.)
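
If you are working in a notebook or script, you can also log in programmatically with the huggingface_hub library (a minimal sketch; it prompts for your token if none is cached):

```python
from huggingface_hub import login

# Prompts for your access token if one isn't already cached;
# you can also pass it explicitly: login(token="hf_...")
login()
```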


Step 2️⃣ — Choose a Model Name

Decide what to call it:

  • Be descriptive: my-python-helper-llm

  • Use lowercase and dashes for readability.


Step 3️⃣ — Use transformers to Push

If you used Trainer or PEFT:
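
A minimal sketch of both routes (assuming `trainer`, `model`, and `tokenizer` are the objects from your training run, and `YOUR_USERNAME/YOUR_MODEL_NAME` is the repo name you picked in Step 2):

```python
# Route A: you trained with the Trainer API.
# The repo name comes from `hub_model_id` in your TrainingArguments.
trainer.push_to_hub()

# Route B: push the model and tokenizer objects directly.
model.push_to_hub("YOUR_USERNAME/YOUR_MODEL_NAME")
tokenizer.push_to_hub("YOUR_USERNAME/YOUR_MODEL_NAME")
```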

✅ Done! Your model now lives at: https://huggingface.co/YOUR_USERNAME/YOUR_MODEL_NAME


If You Used LoRA/PEFT

If you fine-tuned with LoRA, you’ll usually push just the adapter:
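
For example, a sketch assuming `model` is your trained PEFT model and `my-python-helper-lora` is just an illustrative repo name:

```python
# Pushes only the adapter weights and adapter_config.json,
# not the full base model (so the upload stays small).
model.push_to_hub("YOUR_USERNAME/my-python-helper-lora")
tokenizer.push_to_hub("YOUR_USERNAME/my-python-helper-lora")
```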

When someone wants to use it, they load the base model and apply your adapter on top:
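
A sketch of the loading side (replace `BASE_MODEL_NAME` with whatever base model you fine-tuned from):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the original base model, then apply the adapter on top of it.
base = AutoModelForCausalLM.from_pretrained("BASE_MODEL_NAME")
model = PeftModel.from_pretrained(base, "YOUR_USERNAME/my-python-helper-lora")
tokenizer = AutoTokenizer.from_pretrained("BASE_MODEL_NAME")
```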


Step 4️⃣ — Add a README.md

A good model card includes:

  • What the model does (and doesn’t do)

  • How it was trained (data, steps, license)

  • Example usage

  • Limitations and disclaimers

The Hub will auto-create a basic README. Edit it in the web UI!
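
If you prefer to write the README.md locally, one way to upload it is huggingface_hub's upload_file (a sketch; adjust the path and repo id to yours):

```python
from huggingface_hub import upload_file

upload_file(
    path_or_fileobj="README.md",      # your local model card
    path_in_repo="README.md",
    repo_id="YOUR_USERNAME/YOUR_MODEL_NAME",
)
```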


Step 5️⃣ — Make It Public (or Private)

By default, models are public. You can make them private for team use:
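
For example, a sketch using huggingface_hub (exact helpers may vary slightly by version):

```python
from huggingface_hub import HfApi

# Push as private from the start:
# model.push_to_hub("YOUR_USERNAME/YOUR_MODEL_NAME", private=True)

# Or flip an existing repo to private:
HfApi().update_repo_visibility("YOUR_USERNAME/YOUR_MODEL_NAME", private=True)
```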

Or toggle visibility in the web interface.


Step 6️⃣ — Test It!

Try loading your model from scratch to verify it works:
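
Something along these lines for a causal-LM fine-tune (swap in the Auto class that matches your model type):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("YOUR_USERNAME/YOUR_MODEL_NAME")
tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/YOUR_MODEL_NAME")

# Quick smoke test: generate a short completion.
inputs = tokenizer("Write a Python function that reverses a string.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```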


Good Practices

✔️ Add tags like RAG, LoRA, instruction-tuned to help others find your model.
✔️ Include an example inference.py or a Gradio demo link.
✔️ Pin important files (config.json, tokenizer.json).


🗝️ Key Takeaway

Uploading to the 🤗 Hub turns your local model into a cloud-ready, plug-and-play asset — shareable, versioned, and reusable in any script or app.


➡️ Next: Learn how to serve your model through the Inference API, or deploy it with accelerate or FastAPI for production use!
