06. Hugging Face Endpoints
The Hugging Face Hub is a platform with more than 120,000 models, 20,000 datasets, and 50,000 demo apps, all open source and publicly available. On this online platform, people can easily collaborate and build machine learning together.
The Hugging Face Hub also offers a variety of endpoints for building ML applications. This example shows how to connect to the different endpoint types.
In particular, text generation is powered by Text Generation Inference (TGI): a Rust, Python, and gRPC server purpose-built for very fast text generation inference.
Issuing a Hugging Face Token
After signing up at Hugging Face (https://huggingface.co), you can issue a token at the address below.
Token issuance: https://huggingface.co/docs/hub/security-tokens
Model lists for reference
Hugging Face LLM Leaderboard: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
LogicKor leaderboard: https://lk.instruct.kr/
Using Hugging Face Endpoints
To use Hugging Face Endpoints from Python, you need to install the huggingface_hub package.

Save the already-issued token in a .env file under the key HUGGINGFACEHUB_API_TOKEN, then load HUGGINGFACEHUB_API_TOKEN as shown below before proceeding to the next step.
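A minimal sketch of these setup steps; it assumes the python-dotenv package (not mentioned in the original) is used to read the .env file.

```python
# Install the required package (uncomment when running in a notebook).
# !pip install -qU huggingface_hub

import os

from dotenv import load_dotenv

# Read HUGGINGFACEHUB_API_TOKEN from the .env file into the environment.
load_dotenv()

# Confirm the token was picked up.
print("HUGGINGFACEHUB_API_TOKEN" in os.environ)
```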
Enter the Hugging Face Token
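One way to do this is with the interactive login() helper from huggingface_hub, sketched below; it prompts for the token and stores it for subsequent API calls.

```python
from huggingface_hub import login

# Opens an interactive prompt asking for the Hugging Face token.
login()
```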
Generate a simple prompt.
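A sketch using PromptTemplate from langchain_core; the question-and-answer template text is an illustrative assumption.

```python
from langchain_core.prompts import PromptTemplate

# A simple QA prompt with a single {question} input variable.
template = """Question: {question}

Answer: """
prompt = PromptTemplate.from_template(template)
```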
Serverless Endpoints
The Inference API is free to use, but rate-limited. If you need an inference solution for production, check out the Inference Endpoints service. Inference Endpoints make it easy to deploy any machine learning model on dedicated, fully managed infrastructure: choose the cloud, region, compute instance, auto-scaling range, and security level to match your model, latency, throughput, and compliance requirements.
Here is an example of how to access the Inference API.
Assign the repo ID (repository ID) of the Hugging Face model to the repo_id variable.
Model used: microsoft/Phi-3-mini-4k-instruct (https://huggingface.co/microsoft/Phi-3-mini-4k-instruct)
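A sketch of querying the model over the serverless Inference API with HuggingFaceEndpoint from the langchain_huggingface package; the sampling parameters and the example question are illustrative assumptions.

```python
import os

from langchain_core.output_parsers import StrOutputParser
from langchain_huggingface import HuggingFaceEndpoint

# Repository ID of the model served through the Inference API.
repo_id = "microsoft/Phi-3-mini-4k-instruct"

llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    max_new_tokens=256,  # cap on the number of generated tokens
    temperature=0.1,     # low temperature for near-deterministic output
    huggingfacehub_api_token=os.environ["HUGGINGFACEHUB_API_TOKEN"],
)

# Chain the prompt from the previous step with the model.
chain = prompt | llm | StrOutputParser()
print(chain.invoke({"question": "What is the capital of South Korea?"}))
```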
Dedicated Endpoints
The free serverless API lets you implement and iterate on solutions quickly, but it may be rate-limited for high-volume use cases, since the load is shared with other requests.
For enterprise workloads, it is best to use Inference Endpoints - Dedicated. This gives you access to fully managed infrastructure that offers more flexibility and speed.
These resources come with ongoing support and uptime guarantees, as well as options such as autoscaling.
Set the URL of the Inference Endpoint in the hf_endpoint_url variable.
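A sketch of pointing HuggingFaceEndpoint at a dedicated endpoint; the URL is a hypothetical placeholder to be replaced with the one shown on your endpoint's dashboard, and the generation parameters are illustrative.

```python
from langchain_huggingface import HuggingFaceEndpoint

# Hypothetical placeholder URL; copy the real one from the endpoint dashboard.
hf_endpoint_url = "https://your-endpoint-name.region.vendor.endpoints.huggingface.cloud"

llm = HuggingFaceEndpoint(
    endpoint_url=hf_endpoint_url,
    max_new_tokens=512,
    temperature=0.01,
)

print(llm.invoke("What is the capital of South Korea?"))
```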