Run a Large Language Model (LLM) locally using Node.js

Feb 1, 2024 - 5 min read

(Large Language Models are all the rage these days, and for good reason. This mini-blog quickly describes how you can run an open source LLM - Meta's Llama 2 - on your local development machine using Ollama.)

Overview

When it comes to open source Large Language Models, much credit is given to Meta for the release of Llama 2 - a model whose weights are openly available and may be used commercially under the terms of Meta's license. This is a rare approach that should be greatly applauded! Ollama is a separate open source tool that makes it easy to download and run models like Llama 2 locally. This mini-blog quickly shows you how to invoke Ollama AI prompts locally using Node.js.

Requirements

  • Node v20.11.0+
  • NPM v10.2.4+
  • 3GB+ Available Storage Space
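
If you are unsure which versions you have installed, you can check both from a terminal (the output shown below is illustrative; yours will reflect your installed versions):

% node --version
v20.11.0
% npm --version
10.2.4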

Full project code for this sample is available on GitHub.

Download Ollama Client

First you need to download the Ollama client locally. You can download it through the Ollama GitHub repository or directly from their website, ollama.ai:

[Screenshot: Ollama download page]

Select your operating system, download, and install the app locally on your development machine.
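
Once installed, you can verify the client is available from a terminal (the version number below is just an example; yours may differ):

% ollama --version
ollama version is 0.1.22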

Download Model

Next you need to download an actual LLM model to run your client against. Open the Ollama GitHub repo and scroll down to the Model Library. Depending on the model you select, you'll need anywhere from ~3-7GB of available storage space on your machine.

[Screenshot: Ollama model library]

For example, to download the Llama 2 model, run:

% ollama run llama2

Once successfully downloaded, you can now start running chat prompts locally on your machine. For example:

% ollama run llama2
>>> Send a message (/? for help)
>>> Why is the sky blue?

The sky appears blue because of a phenomenon called Rayleigh scattering...
...
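
At any point you can confirm which models are downloaded locally by running ollama list. The output will look something like this (the model ID and size shown are illustrative):

% ollama list
NAME            ID              SIZE      MODIFIED
llama2:latest   78e26419b446    3.8 GB    2 minutes ago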

To exit the ollama shell, type /bye at the prompt. Once you can run prompts locally, you are ready to integrate Ollama into JavaScript source code.

Create Node.js Project

Open a terminal window, create a new directory, and initialize a new project:

% mkdir ollama-test
% cd ollama-test
% npm init

This will create a new package.json file in the current directory with contents similar to:

{
  "name": "ollama-test",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC"
}

To run the script we are about to create, you will need to add one line to your package.json file to configure the project as an ES module. Add the line "type": "module", like so:

{
  ...
  "type": "module",
  ...
}

Next, install the open source ollama NPM package (from the ollama-js project), which is required for this sample. Under the hood, the package communicates with the local Ollama server, which listens on http://localhost:11434 by default:

% npm i ollama

Lastly, create an index.js file in the /ollama-test directory with the following source code:

import ollama from 'ollama'

let chatConfig = {
  model: "llama2",
  role: "user",
  content: "Why is the sky blue?"
}

// check for a chat content argument, otherwise use the default prompt above
if( process.argv[2] && process.argv[2].trim().length > 0 ) {
  chatConfig.content = process.argv[2]
}

async function invokeLLM(props) {
  console.log(`-----`)
  console.log(`[${props.model}]: ${props.content}`)
  console.log(`-----`)
  try {
    console.log(`Running prompt...`)
    const response = await ollama.chat({
      model: props.model,
      messages: [{ role: props.role, content: props.content }],
    })
    console.log(`${response.message.content}\n`)
  }
  catch(error) {
    console.log(`Query failed!`)
    console.log(error)
  }
}

invokeLLM(chatConfig)

Note: If you downloaded a model other than Llama 2, you will need to update the model property in the chatConfig object above.
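
As a variation, the ollama package also supports streaming, so you can print the response token by token as it is generated instead of waiting for the full completion. Here is a minimal sketch (assuming the same llama2 model and the "type": "module" setting from earlier, which enables top-level await):

import ollama from 'ollama'

// with stream: true, ollama.chat() returns an async iterable of partial responses
const response = await ollama.chat({
  model: "llama2",
  messages: [{ role: "user", content: "Why is the sky blue?" }],
  stream: true,
})

for await (const part of response) {
  // each part carries the next chunk of the model's reply as it is generated
  process.stdout.write(part.message.content)
}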

Run Application

You are now ready to start invoking AI prompts programmatically by typing node index.js in a terminal. You should see output similar to:

% node index.js
-----
[llama2]: Why is the sky blue?
-----
Running prompt...

The sky appears blue because of a phenomenon called Rayleigh scattering...
...

The code checks for an extra command-line argument that can be used to specify the prompt's content. To try it, pass a new content string when you invoke the script, such as:

% node index.js "How old is the universe?"

And you should see output similar to:

-----
[llama2]: How old is the universe?
-----
Running prompt...

The age of the universe is estimated to be around 13.8 billion years...
...

And there you go: you can now run Ollama prompts on your local machine using Node.js. Happy coding!

Thanks for reading!

by Ergin Dervisoglu (Feb 1, 2024)