JSONFormer
JSONFormer is a library that wraps local Hugging Face pipeline models for structured decoding of a subset of the JSON Schema.
It works by filling in the structure tokens and then sampling the content tokens from the model.
Warning - this module is still experimental
%pip install --upgrade --quiet jsonformer > /dev/null
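Before building the baseline, here is a minimal sketch of the underlying jsonformer API itself, independent of LangChain, to make the mechanism concrete: the library emits the braces, keys, and quotes of the schema deterministically and only samples the values from the model. The model name and schema fields below are illustrative choices, not part of this guide.
from jsonformer import Jsonformer
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative small model; any local causal LM works here.
model = AutoModelForCausalLM.from_pretrained("cerebras/Cerebras-GPT-590M")
tokenizer = AutoTokenizer.from_pretrained("cerebras/Cerebras-GPT-590M")

# Only the values are sampled from the model; the structure is fixed.
json_schema = {
    "type": "object",
    "properties": {
        "action": {"type": "string"},
        "action_input": {"type": "string"},
    },
}

builder = Jsonformer(model, tokenizer, json_schema, "Respond with an action:")
print(builder())  # always returns a dict matching json_schema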
Hugging Face Baseline
First, let's establish a qualitative baseline by checking the output of the model without structured decoding.
import logging
logging.basicConfig(level=logging.ERROR)
import json
import os
import requests
from langchain_core.tools import tool
HF_TOKEN = os.environ.get("HUGGINGFACE_API_KEY")
@tool
def ask_star_coder(query: str, temperature: float = 1.0, max_new_tokens: float = 250):
"""Query the BigCode StarCoder model about coding questions."""
url = "https://api-inference.huggingface.co/models/bigcode/starcoder"
headers = {
"Authorization": f"Bearer {HF_TOKEN}",
"content-type": "application/json",
}
payload = {
"inputs": f"{query}\n\nAnswer:",
"temperature": temperature,
"max_new_tokens": int(max_new_tokens),
}
response = requests.post(url, headers=headers, data=json.dumps(payload))
response.raise_for_status()
return json.loads(response.content.decode("utf-8"))
API Reference: tool
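The prompt below interpolates the tool's argument schema via ask_star_coder.args, which LangChain infers from the function signature. A quick way to see what gets injected (the exact field descriptions come from the generated schema):
# Inspect the auto-generated argument schema used as {arg_schema} in the prompt.
print(ask_star_coder.args)
# A dict describing 'query', 'temperature', and 'max_new_tokens',
# with their types and defaults, inferred from the signature above.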
prompt = """You must respond using JSON format, with a single action and single action input.
You may 'ask_star_coder' for help on coding problems.
{arg_schema}
EXAMPLES
----
Human: "So what's all this about a GIL?"
AI Assistant:{{
"action": "ask_star_coder",
"action_input": {{"query": "What is a GIL?", "temperature": 0.0, "max_new_tokens": 100}}"
}}
Observation: "The GIL is python's Global Interpreter Lock"
Human: "Could you please write a calculator program in LISP?"
AI Assistant:{{
"action": "ask_star_coder",
"action_input": {{"query": "Write a calculator program in LISP", "temperature": 0.0, "max_new_tokens": 250}}
}}
Observation: "(defun add (x y) (+ x y))\n(defun sub (x y) (- x y ))"
Human: "What's the difference between an SVM and an LLM?"
AI Assistant:{{
"action": "ask_star_coder",
"action_input": {{"query": "What's the difference between SGD and an SVM?", "temperature": 1.0, "max_new_tokens": 250}}
}}
Observation: "SGD stands for stochastic gradient descent, while an SVM is a Support Vector Machine."
BEGIN! Answer the Human's question as best as you are able.
------
Human: 'What's the difference between an iterator and an iterable?'
AI Assistant:""".format(arg_schema=ask_star_coder.args)
from langchain_huggingface import HuggingFacePipeline
from transformers import pipeline
hf_model = pipeline(
"text-generation", model="cerebras/Cerebras-GPT-590M", max_new_tokens=200
)
original_model = HuggingFacePipeline(pipeline=hf_model)
generated = original_model.predict(prompt, stop=["Observation:", "Human:"])
print(generated)
API Reference: HuggingFacePipeline
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
'What's the difference between an iterator and an iterable?'
That's not so impressive, is it? It didn't follow the JSON format at all! Let's try with the structured decoder.