Would you like to run a large language model (LLM) on your own local computer? So that all your data and requests stay local? Would you like your own private “ChatGPT”-like thing that you could ask questions of?
I happened to be looking at my personal email the other afternoon when Simon Willison's email newsletter appeared. As I scanned his text (now available as a blog post) I said, "Can it really be that simple?"
Yes!
Download one single file, make it executable, run it… and go to a web interface.
BOOM! You have your own pet AI LLM.
Here’s all you need to do
The steps, copied from Mozilla’s Github repository, are simply these:
- Download llava-v1.5-7b-q4-server.llamafile (3.97 GB).
- Open your computer's terminal.
- If you're using macOS, Linux, or BSD, you'll need to grant permission for your computer to execute this new file. (You only need to do this once.)
  chmod +x llava-v1.5-7b-q4-server.llamafile
- If you're on Windows, rename the file by adding ".exe" on the end.
- Run the llamafile. On Windows, launch the .exe. On Mac/Linux, do this in the terminal window:
  ./llava-v1.5-7b-q4-server.llamafile
- Your browser should open automatically and display a chat interface. (If it doesn't, just open your browser and point it at http://localhost:8080.)
- When you're done chatting, return to your terminal and hit Control-C to shut down llamafile.
That’s it.
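For reference, here is the whole Mac/Linux sequence in one place. (A minimal sketch: it assumes you have already downloaded the file from the link in Mozilla's repository, since I'm not reproducing the download URL here.)

```
# Grant execute permission (only needed once)
chmod +x llava-v1.5-7b-q4-server.llamafile

# Start the server; your browser should open to the chat interface
./llava-v1.5-7b-q4-server.llamafile

# Chat at http://localhost:8080, then press Control-C here to shut it down
```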
If it doesn't "just work", there is a "Gotchas" section that may help. In my case, my Mac didn't have the Xcode command-line tools installed, so I had to open a new terminal window and type
xcode-select --install
and then accept a license agreement and wait for the installation. (Bizarrely, the install terminated without completing. So I typed the command again, and the progress bar went further before it silently quit. I probably typed it 6 or 8 times before the Xcode command-line tools finished installing!)
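If you run into the same thing, one way to check whether the tools actually finished installing is to ask for their path. (This check is my own suggestion, not something from the "Gotchas" section.)

```
# Prints the developer directory if the tools are installed;
# exits with an error if they are not
xcode-select -p
```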
Features and Limitations
When you connect to http://localhost:8080, you get a screen full of configuration options. Those who want to tweak settings can have a lot of fun. Most folks will want to just scroll to the bottom, where there is a chat box with the prompt text "Say something...".
And now you are in familiar ChatGPT-land. Type in your prompts. Ask your questions. Build on one prompt after another. I asked it to write me a poem.
I have also used it to provide information on various topics. You can do many of the same things you would do with ChatGPT, Bard, or Claude.
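And because everything is served locally, you don't even need the browser: you can script against the same server. A minimal sketch, assuming this llamafile exposes the llama.cpp-style /completion endpoint on port 8080 (check the documentation for your version):

```
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Write me a short poem about running an LLM locally.", "n_predict": 128}'
```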
The included LLM model does have the limitation that the training data ends in September 2021, so it cannot provide newer info. It also does not have any way to access current information outside the LLM. (Which is good from a privacy point of view.)
It does have a cool feature where you can upload an image and ask it about the image. I found this useful in several cases.
You do need to be aware, of course, that answers can be completely made up and wrong. They can be "hallucinated" or "fabricated" or whatever term you want to use that is kinder than simply saying they are complete "BS". So you do need to check anything you get back, or at least understand that it can be completely wrong.
LLM as a Single File
What I find most intriguing about this is the "llamafile" technology that lets you package up an LLM as a single executable that can be downloaded and easily run.
It's been possible for a while now to download an LLM and get it running on your computer. But the process was not easy. I tried it with some of the earlier examples, and my results were mixed.
Now… this is super simple.
Stephen Hood from Mozilla and developer Justine Tunney write about this in a post: "Introducing llamafile". I was not aware of Justine's incredible work on "Cosmopolitan Libc", which allows you to create executables that can run across seven different operating systems. Amazing!
As someone very concerned about privacy and security, I appreciate that this allows me to run an LLM within my own security perimeter. I don't need to worry about my private data being sent off to some other server, or being used as training data for some large LLM and potentially reappearing in the results for someone else.
All of my data, my prompts, and my results STAY LOCAL.
Plus, there’s a resiliency angle here. As Simon Willison writes:
> Stick that file on a USB stick and stash it in a drawer as insurance against a future apocalypse. You’ll never be without a language model ever again.
Indeed! You’ll always have access to this tool if you want it. (Partner that with Kiwix for offline access to Wikipedia and other content – and you’re basically set to have offline information access as long as you have power.)
I’m looking forward to seeing where this all goes. This particular download is for one specific LLM. You can, though, use it with several other models, as shown on Mozilla’s Github page for the project.
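The flow for other models looks the same, just with a different file. For example, a sketch using the Mistral server llamafile listed in the README (treat the exact file name as an assumption; check the repository for the current one):

```
# Hypothetical file name; use the actual one from Mozilla's Github page
chmod +x mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile
./mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile
```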
I’m sure others will build on this now.
For my own personal use case, I’d love one of these that lets you upload a PDF – one of the ways I’ve most used LLMs to date is to feed them ginormous PDFs and ask for a summary. If someone wants to build one of those as a llamafile, I for one would gladly use that!
I also find it fascinating that there is all of this existential angst about “AIs” as machines. But what if it turns out that an “AI” is best as a simple file? That could just be launched whenever it was needed?
Not quite the form we were thinking for our new overlords, was it?
(I feel like I saw a post from someone on this theme, but of course I can't find it now that I want it.)
Anyway… I hope you enjoyed this exploration of LLMs-on-your-laptop! Have fun with your own personal text generator! (Which could all be made up.)
Thanks for reading to the end. I welcome any comments and feedback you may have.
Please drop me a note in email. If you are a subscriber, you should just be able to reply back. And if you aren't a subscriber, you can sign up to receive future messages.
This IS also a WordPress-hosted blog, so you can visit the main site and add a comment to this post, like we used to do back in the glory days of blogging.
Or if you don’t want to do email, send me a message on one of the various social media services where I’ve posted this. (My preference continues to be Mastodon, but I do go on others from time to time.)
Until the next time,
Dan
Connect
The best place to connect with me these days is:
- Mastodon: danyork@mastodon.social
You can also find all the content I'm creating at crowsnest.danyork.com. If you use Mastodon or another Fediverse system, you should be able to follow this newsletter by searching for "@crowsnest.danyork.com@crowsnest.danyork.com".
You can also connect with me at these services, although I do not interact there quite as much (listed in decreasing order of usage):
- LinkedIn: https://www.linkedin.com/in/danyork/
- Soundcloud (podcast): https://soundcloud.com/danyork
- Instagram: https://www.instagram.com/danyork/
- Twitch: https://www.twitch.tv/danyork324
- TikTok: https://www.tiktok.com/@danyork324
- Threads: https://www.threads.net/@danyork
- BlueSky: @danyork.bsky.social
Disclaimer
This newsletter is a personal project I've been doing since 2007 or 2008, several years before I joined the Internet Society in 2011. While I may at times mention information or activities from the Internet Society, all viewpoints are my personal opinion and do not represent any formal positions or views of the Internet Society. This is just me, saying some of the things on my mind.