LangChain, an on-premises CPU LLM, and a basic Prompt

Self-hosting Large Language Models?

  • Possible

Are they as capable as SaaS subscription models?

  • No, but they have their own use cases.

 

LLMWare Bling on a CPU

We use the Hugging Face pipeline for easy access to the pre-trained model; LLMWare Bling is a CPU-friendly LLM. Note that the model is loaded with trust_remote_code=True, which allows custom model code from Hugging Face to be executed locally.

 

https://huggingface.co/llmware/bling-stable-lm-3b-4e1t-v0

bling-stable-lm-3b-4e1t-v0 is part of the BLING ("Best Little Instruction-following No-GPU-required") model series, RAG-instruct trained on top of a StabilityAI stablelm-3b-4e1t base model.

BLING models are fine-tuned with distilled high-quality custom instruct datasets, targeted at a specific subset of instruct tasks with the objective of providing a high-quality Instruct model that is 'inference-ready' on a CPU laptop even without using any advanced quantization optimizations.

 

 

import os

from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, logging

model_id = "llmware/bling-stable-lm-3b-4e1t-v0"

# Optionally, increase the logging level if you want to see more details about the download process
logging.set_verbosity_info()

# Ensure the directory for saving models is created and specified in your environment,
# so that the model download doesn't prompt for a storage location or confirmation.
# Make sure you have set TRANSFORMERS_CACHE in your environment variables:
# os.environ["TRANSFORMERS_CACHE"] = "/path/to/your/preferred/cache/directory"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=500)
hf = HuggingFacePipeline(pipeline=pipe)
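
Before wiring the pipeline into a chain, the wrapped model can be smoke-tested directly. This is a minimal sketch assuming the hf object from the listing above; HuggingFacePipeline supports the standard LangChain invoke call with a plain string prompt, and generation on a CPU may take a while.

# Optional: quick smoke test of the wrapped pipeline before building a chain.
# Assumes the `hf` object created above; the prompt is just an illustrative example.
print(hf.invoke("Explain in one sentence what an EEG measures."))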

 

Prompting the model with LangChain

from langchain.prompts import PromptTemplate

template = """Question: {question}

Answer: Let's think step by step."""

prompt = PromptTemplate.from_template(template)
chain = prompt | hf

question = "What is electroencephalography?"
test = chain.invoke({"question": question})
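
Since the pipeline is wrapped as an LLM, the chain returns the generated text as a plain string, so a simple print is enough to inspect the model's answer (a minimal sketch, assuming the chain above ran without errors):

# Inspect the model's answer; `test` holds the generated text as a string.
print(test)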

 

Listing and GitHub repo

The answer to this very relevant question can be found in the listing:

 
