Free, effective, and more ethical alternatives to ChatGPT do exist. Here’s how to get started

The democratisation of large language models is here - The Unreasonable Effectiveness of Open Source LLMs.

4 Minute Read.

TLDR:

If you care about privacy, don’t use ChatGPT.

Don't use ChatGPT; use Hugging Chat, which offers free, web-accessible, open-source models, specifically Mixtral 8x7B from Mistral AI. This model generates results comparable to those of ChatGPT 3.5 but is free and has internet access, a feature that is usually paid. You can also use Code Llama, the model that powers Phind.

ChatGPT knows it doesn't care about privacy

If you care about privacy, don’t use ChatGPT. But you don’t have to take my word for it; listen to ChatGPT itself:

But you might be thinking: “Hey, that’s only for locally hosted models; of course those are more private.” So let’s ask ChatGPT what it thinks you should do if you are using a publicly hosted open-source model:

PS. Running a model locally, or using a local setup, can be thought of as downloading a program, installing it on your PC, and using it without connecting to the internet.

Concerns about ChatGPT

Similar to many others, I am becoming increasingly concerned about ChatGPT—not just from an ethical standpoint but also regarding productivity. Concerns related to ChatGPT typically involve the following aspects:

  • Data Security: There is no way to confirm what OpenAI is doing with your data or what precautions it takes to protect it.

  • Production of Unethical or Misleading Content: Without viewing the code, it is impossible for the broader community to determine why such material is generated.

  • OpenAI’s Profit Motive: Your information could potentially be sold without your knowledge.

  • Closed-Source Nature: Limited visibility exists concerning activities happening behind the scenes.

Should you require additional details on these points, simply search Arxiv.org for “ChatGPT” and “safety”. The following academic article lays out some specific security concerns. Alternatively, explore generative large language models that emphasise human rights or privacy, such as Anthropic’s Claude. Note, though, that options like Claude remain proprietary.

To summarize my concerns: I find it problematic when a tool labels itself the top choice while admitting that there is no objectively defined ‘best’. Such behaviour suggests inherent bias.

Consider comparing responses below—first from ChatGPT, followed by Mistral. Notice how Mistral avoids referencing itself even once:

Once again, don’t rely solely on my judgment; consult OpenAI. They expressed the following sentiments in a 2015 blog entry:

OpenAI is a non-profit artificial intelligence research company. Our goal is to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return.

Since our research is free from financial obligations, we can better focus on a positive human impact. We believe AI should be an extension of individual human wills and, in the spirit of liberty, as broadly and evenly distributed as possible.

Exploring Open Source Alternatives

The issues mentioned above apply primarily to projects that are not open source. Notably, when OpenAI was founded, it intended its work to be open source, hence the ‘Open’ in its name. Regrettably, OpenAI has moved away from its original values. Fortunately, several open-source alternatives have emerged. Be advised, though, that some of these come from big tech companies, carrying potential ethical quandaries of their own.

Among the leading open-source generative LLM models currently available are:

  • Meta’s (Facebook’s) Llama – the regular version, or Code Llama, which is tailored for working with code.

  • Google’s Gemma – a smaller model that performs well for its size.

  • Mistral – I observed the best results when testing this model.

To experiment with these and other models for free, along with web-search functionality (a feature ChatGPT charges for), try Hugging Chat.

Running chat models locally at home or in your organisation

Ordinarily, running these large chat models demands computational power far beyond the average PC. Creating, training, and using such technologies has therefore been exclusive and cost-prohibitive. For individuals who cannot afford premium machines, small businesses, and non-profits, running these models locally has posed formidable challenges.

Recently, the democratisation of large language models has gained momentum thanks to the arrival of GGUF models. GGUF is a file format for quantised versions of established models: it trades a slight reduction in numerical precision for substantial improvements in memory use and speed. As a result, GGUF models can now be run locally on standard PC hardware.
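To see the core idea behind quantisation, here is a toy Python sketch in the spirit of GGUF (purely an illustration of the precision-for-efficiency trade-off, not the actual GGUF format): floats are stored as 8-bit integers plus a scale factor, shrinking memory roughly fourfold at the cost of small rounding errors.

```python
# Toy illustration of quantisation: store weights as 8-bit integers
# plus one scale factor, instead of 32-bit floats.

def quantize(weights):
    """Map floats to int8 values in [-127, 127] with a single scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate floats from the quantised integers."""
    return [v * scale for v in q]

weights = [0.12, -0.98, 0.55, 0.0, 0.731]
q, scale = quantize(weights)
restored = dequantize(q, scale)
# The restored values differ from the originals by at most half a
# quantisation step, i.e. a small, bounded precision loss.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Real GGUF files use more sophisticated per-block quantisation schemes, but the principle is the same: a little precision is sacrificed so that a model fits in ordinary RAM and runs on ordinary CPUs.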

Integrating Open Source Models into Local Workflows for Data Protection

If your PC has 16 GB of RAM and a reasonably recent CPU, running most GGUF models will still be relatively slow when you are after ChatGPT-level results: up to a minute or so per response. The speed jump is drastic if you have even a single GPU. If you want simple results, or need the model to perform repetitive tasks, responses will be significantly faster, a matter of seconds.

Unfortunately, implementing such models locally is still not straightforward for the typical user. I will lay out three methods to implement models locally, or 'more' locally:

  • using a Python wrapper for Hugging Chat – this unofficial API wrapper is not truly local, as you will still need an internet connection. The advantages are that you can make requests very quickly and integrate it into a larger custom process of your own design;

  • using a llama.cpp user interface (llama.cpp is not created by Facebook) – basically an app that runs the GGUF models listed above in an easy-to-use fashion. Although this approach is local, it is difficult to integrate into a custom process;

  • using a llama.cpp wrapper, such as the Python bindings – I found that using llama.cpp directly, as opposed to through a UI, is more efficient, and it can also be fully integrated into a local process. However, if you do not know Python, it is the most difficult option to implement.
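As a sketch of the last approach (driving llama.cpp from Python): llama.cpp ships a small HTTP server, so one low-friction pattern is to start it with your GGUF model (e.g. `./llama-server -m model.gguf --port 8080`) and query it from Python's standard library. The endpoint and field names below follow llama.cpp's server API, but verify them against the version you build; the URL and parameter values are assumptions about your local setup.

```python
import json
import urllib.request

# Assumes llama.cpp's bundled HTTP server is already running locally
# with a GGUF model loaded. Everything stays on your machine: no data
# leaves localhost.
SERVER = "http://127.0.0.1:8080/completion"

def build_payload(prompt, n_predict=128):
    """Build the JSON body for a llama.cpp /completion request."""
    return {"prompt": prompt, "n_predict": n_predict, "stream": False}

def ask_local_model(prompt):
    """Send a prompt to the local server and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        SERVER, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]
```

With the server running, `ask_local_model("Summarise GGUF in one sentence.")` returns the model's reply as a string, and the function can be dropped into any larger local pipeline.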

If you are interested in running any of these models locally, message me!

Also, a huge thank you to Georgi Gerganov, the creator of llama.cpp and the GGUF format. I do not know this person, but it seems they are single-handedly doing much of this democratisation work for free. Follow them on Twitter: https://twitter.com/ggerganov

Subscribe to Tech Solutions for Human Rights and never miss a post.
#ai #tech democratisation