Television was going to transform society - in a good way. We know how that went. Then it was going to be personal computers. Then the Internet. Now it’s Large Language Models (LLMs).
Our intellectual superiors hype ChatGPT, spawn of the LLM, as the thing that will finally do it.
Believe them; it will - good and hard, unless we …
First, Cool The Hype
The best hype deflator is cost:
… training large general-purpose models is very “costly”. According to the report “How much computing power does ChatGPT need”, the cost of a single training session for GPT-3 is estimated to be around $1.4 million, and for some larger LLMs (Large Language Models), the training cost ranges from $2 million to $12 million. With an average of 13 million unique visitors to ChatGPT in January, the corresponding chip requirement is more than 30,000 Nvidia A100 GPUs, with an initial investment cost of about $800 million and a daily electricity cost of about $50,000.
If the current ChatGPT is deployed to every search conducted by Google, 512820.51 A100 HGX servers and a total of 4102568 A100 GPUs are required, and the total cost of these servers and network is over $100 billion in capital expenditure alone.
The above quote is originally from a Forbes article entitled
ChatGPT Burns Millions Every Day. Can Computer Scientists Make AI One Million Times More Efficient?
Interesting that the question is even being asked. According to Business Insider, it could be costing OpenAI $700,000 a day to run ChatGPT. If true, that’s unsustainable - which is why OpenAI is energetically seeking funding to set up its own network of chip foundries to make chips optimised for LLMs.
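The quoted figures are easy to sanity-check with back-of-envelope arithmetic. A minimal sketch, taking the article’s own assumptions (8 GPUs per A100 HGX server, the quoted server count and daily cost) purely as inputs - the numbers are illustrative, not verified data:

```python
# Back-of-envelope check of the quoted ChatGPT cost figures.
# All inputs are the article's assumptions, not verified data.

GPUS_PER_HGX_SERVER = 8  # one A100 HGX server carries 8 GPUs

# Figures quoted for deploying ChatGPT on every Google search
servers_needed = 512_820.51
gpus_needed = servers_needed * GPUS_PER_HGX_SERVER
print(f"GPUs implied by server count: {gpus_needed:,.0f}")  # → 4,102,564

# Daily running cost quoted by Business Insider, annualised
daily_cost = 700_000
annual_cost = daily_cost * 365
print(f"Implied annual running cost: ${annual_cost:,}")  # → $255,500,000
```

The server-count arithmetic lands within a few GPUs of the article’s 4,102,568 figure (rounding of the fractional server count accounts for the gap), which suggests the quoted numbers are simple multiples of one underlying estimate rather than independent measurements.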
Second, What Is It?
ChatGPT is an online, algorithmic AI chat system trained on a massive collection of data from internet sources (websites, forums, documents, etc.) to provide an intelligible, but not necessarily intelligent, response to questions, aka prompts.
Translation: what’s put on the web stays on the web meets there’s money and power in plagiarism.
Version 3.5 is free because OpenAI did not really think there would be much money in it. They needed to demonstrate the power of LLMs to sustain their ravenous investment needs, and to do it quickly they had to get lots of eyeballs on the product. They initially made the code open source as an incentive to what they wrongly thought would be a few geeks. Having underestimated the value of their investment, they quickly added some enhancements that were almost ready for release anyway and announced version 4 - it costs $$.
Version 3.5 remains free, but Version 3.5 Turbo is by subscription at $20 per month. If you want to use version 3.5 with your own knowledge base, you will have to find and clean your own data, then pre-train the model yourself. See costs above.
The GPT part stands for Generative Pre-trained Transformer, a language model developed by OpenAI.
Language Modelling is the use of various statistical and probabilistic techniques to determine the probability of a given sequence of words occurring in a sentence. Language models analyze bodies of text data to provide a basis for their word predictions.
Language modeling is used in artificial intelligence (AI), natural language processing (NLP), natural language understanding and natural language generation systems, particularly ones that perform text generation, machine translation and question answering.
Large language models (LLMs) also use language modeling. These are advanced language models, such as OpenAI's GPT-3 and Google's PaLM 2, that handle billions of training data parameters and generate text output.
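The “probability of a given sequence of words” idea can be made concrete with a toy bigram model - counts over a corpus converted into conditional word probabilities. A minimal sketch; the tiny corpus and the zero-probability handling are my own illustrative choices, a world away from a billion-parameter LLM but the same statistical principle:

```python
from collections import defaultdict

# Toy bigram language model: P(sequence) ≈ product of P(word | previous word),
# estimated from raw counts over a tiny illustrative corpus.
corpus = [
    "the horse serves the man",
    "the man serves the horse",
    "the man rides the horse",
]

bigram_counts = defaultdict(lambda: defaultdict(int))
prev_counts = defaultdict(int)
for sentence in corpus:
    words = sentence.split()
    for prev, cur in zip(words, words[1:]):
        bigram_counts[prev][cur] += 1
        prev_counts[prev] += 1

def sequence_probability(sentence):
    """Probability of a word sequence under the bigram model (0 if unseen)."""
    words = sentence.split()
    prob = 1.0
    for prev, cur in zip(words, words[1:]):
        if prev_counts[prev] == 0:
            return 0.0
        prob *= bigram_counts[prev][cur] / prev_counts[prev]
    return prob

print(sequence_probability("the man serves the horse"))  # → 0.125
print(sequence_probability("the horse rides the man"))   # → 0.0
```

The second sentence scores zero because the bigram “horse rides” never occurs in the corpus - which is exactly why real models train on massive data and use smoothing: a language model can only predict what its sources contain.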
As of this post, according to OpenAI, the ChatGPT 4 knowledge cutoff is April 2023, though the model carries a disclaimer that it is “not suited yet for production traffic”. For ChatGPT 3.5 it is September 2021. The boundaries of the alleged knowledge are completely obscure, including the stated cutoff dates. A number of developers have tried to test the date boundaries with prompts, and there appear to be plenty of anomalies.
We are talking internet data here - the term knowledge should be taken with a grain of salt.
The idea that a reliable cutoff date can be valid for all internet data is absurd. The rate at which the internet is churning is too great for current analytic infrastructure to keep up with. There are also such things as data obsolescence and inaccuracy. For my money, discussing source data without discussing currency, accuracy, meaning and authenticity is a waste of time. As is the excitement about scavenging online data, when the vast bulk of human knowledge and wisdom exists outside of the net.
I am confident that the brains at OpenAI are fully aware of these issues. I hope they are also aware that the rest of us will lose confidence in their toy if ordinary Joes and Jills keep poking holes in it. We can only hope that it does not amplify the spread of evil, crime and incompetence that is already mediated by the internet.
Some Old Guy Named Ralph Waldo Emerson
Permit me a short-ish digression. We will return to the chat thing soon.
The nature of things, a phrase lately contorted into thingness, has been the subject of debate for centuries - see philosophy, as in Aristotle, Heidegger and others. Studying it takes only a year or so to cause utter confusion, so don’t be discouraged if you want to have a go.
Or, take a shortcut with me via one Waldo Emerson, who knew a thing or two about things. In one of Emerson’s poems, Ode, Inscribed to William H. Channing, there is this:
Things are in the saddle, And ride mankind
How true.
A great deal of contemporary philosophical discussion about technology is keyed off that quote; it is almost everywhere I look while researching the societal effects of technology.
The popular interpretations of it seem to be more about analysing pre-conceived notions (technology rules) than considering the context in which it was written. To me it seems clear from the context that Emerson’s real intent was to criticise man for employing technology in irresponsible, even evil ways.
In the same poem, he writes that the “horseman serves the horse” - a perceptive way of putting it, that turns the tables on the idea of service.
Let’s explore this thing called service in that sense.
Now, a horse is both a living thing and a transport thing. It has to be served (bred or captured, trained, fed and cared for) to be a useful thing. It has a variety of roles in which it serves that thing we call man, making a horse a kind of boomerang thing. It is an unlikely weapon, but horses have been known to kill people. That makes it a potentially dangerous thing, as feral mustang mares would testify if they could.
Since it is alive, it may also function as an independent thing when not being served by or serving man. As we say, it sometimes does its own thing.
When doing its own thing, it may not need man’s services at all. Mustangs in the wild function well enough without having to be shod, for example. In other words, being served by and serving man creates at least one problem for the horse in that it has to be shod, which spawns an avalanche of things that have to do with shoeing horses, as do feeding, grooming, stabling, etc. Take my word for it, they all are boomerang things.
In contrast, a piece of technology whether robot, computer, software code or some appliance, is an inanimate thing, in the sense of “not having the qualities associated with active, living organisms”. It may be used to make or destroy both animate and inanimate things or to provide a thing we call a service. It can be a dangerous thing at the same time as being a good thing. Because it is inanimate, it must be served by man to perform its function, which enables it to serve man in turn - making it also a boomerang thing, but one that cannot be truly autonomous.
Some technologies serve semi-autonomously after they have been “turned on” or “launched” or “activated”. However, their continued autonomy nevertheless remains subject to man’s service by virtue of their power supplies, which are also made and served by man, without which they cannot continue to do their thing.
So, that’s the difference as I see it, between living things and technological things. Life can function without man but technology cannot.
The problem for man is that whatever man does, he cannot avoid serving and being served by things, even if he is the laziest son of a bitch that ever lived. What he has forgotten is that technology cannot do without him. Instead he has come to believe that it only works the other way round. This grows out of the earlier misperception that technologies would save labour when they only saved muscle power. This in turn fostered the illusion that information technologies will improve brain power. We already know differently but are still in denial.
Cognitive capacity and overall brain power are significantly reduced when your smartphone is within glancing distance—even if it’s turned off and face down
What stands out like a sore thumb to me is that the focus is wrong. We should stop obsessing over the effects of technology on society and turn our collective attention to the effects that society can and should have on technology.
Therein may be found the knowledge of technological good and evil.
End Of Digression - Moving Right Along With Simple Examples
One can get information about almost anything online these days and I count it a very good thing indeed. I have, for example, avoided uncountable cockups by reading or watching the results of a “how to” search before launching a project. What one gets far too often is not, however, good value for time spent. Most online “how to” searches throw up beautifully presented tutorials that are utterly useless, while hiding the good stuff on page 31. Allow me to illustrate.
The Crafter, The Baker, The Famous Home Maker
There’s the YouTube channel where some dude shows how to craft children’s toys that look something like this:
Without the slightest self-consciousness, he will proudly introduce his meagre raw materials, and us, to a workshop that looks like this:
He will, with much pointing at and tapping of objects, while giving his eyebrows a vigorous workout, show us how he employs almost every major appliance in the workshop - to make a little wooden something. The camera will linger on the equipment more than once to make sure we take note of the manufacturer’s logo - whose owner almost certainly is paying him to do so.
Now, there’s nothing wrong with the toy; it is beautifully proportioned, finished to perfection and entirely fit for purpose. The workshop is to drool over, and good luck to our craftsman for persuading the manufacturers to pay him to make them look good.
But how far up a sponsor does he have to be to make a simple toy using so many powerful machines?
He could have made the thing with a few hand tools. Who knows, that might inspire the odd teenager to get hands dirty with something other than … a cellphone. Best of all, he would have illustrated the fine art of applying the most appropriate tools to the job, making him a useful thing, a teacher.
Many online craft bakers do similarly cringeworthy things. While researching the art of sourdough bread, I once found a dude who wanted to teach the world about it by using only flour milled from one strain of wheat, milled by an antique water mill in some obscure county.
How about the domestic goddess who demonstrates making an omelette using a designer non-stick, brand-new logoed pan? The kitchen is pristine before, during and after the omelette is cooked and eaten - it is always eaten on camera. There is never any sign of preparation or clean-up. Used utensils disappear off camera and clean ones appear out of nowhere. There are mysterious changes of clothing without a break in the narrative.
This is abuse of and by technology. As a result, the presenters and their equipment serve neither benefit nor quality, and act irresponsibly by serving up a kind of con.
ChatGPT Can Help Fix That
There is a plethora of search engines, and all have the same problem - they don’t understand grammar.
ChatGPT is all about grammar. So why not graft ChatGPT onto search engines? Oh wait - it’s already happening! Now, the likes of Google and Microsoft will certainly manipulate ChatGPT for their own benefit, not ours - why would they stop? So what can be done?
Before the ‘net, we had to do research in libraries, where we mined the card index. Then we mined the relevant volumes in the library. In that way we served the library by justifying its existence, and the library served us by curating the information we needed. We controlled the search for what we needed, and the library controlled the curation of a wide and deep store of knowledge.
This principle can be applied to all technologies, but most particularly to technologies like ChatGPT.
We could, if we were willing to do the work, learn how to use open-source versions of ChatGPT as is, instead of search engines. We will still be served whatever the machine makes of our queries, but at least we will be able to level the playing field over time by refining our ability to craft prompts.
Besides, it would do wonders for the quality of online grammar.
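Refining prompts is a learnable discipline: a deliberate prompt is structured, not a bag of search-engine keywords. A minimal sketch of the idea using plain string templating - the template and its field names are my own invention, purely illustrative, not any model’s official format:

```python
# A deliberate prompt is structured, not a bag of keywords.
# The template fields below are illustrative, not any model's official format.

PROMPT_TEMPLATE = """\
Role: {role}
Task: {task}
Constraints: {constraints}
Question: {question}"""

def craft_prompt(role, task, constraints, question):
    """Assemble a structured prompt from explicit parts."""
    return PROMPT_TEMPLATE.format(
        role=role, task=task, constraints=constraints, question=question
    )

prompt = craft_prompt(
    role="a woodworking teacher using only hand tools",
    task="explain how to make a simple wooden toy",
    constraints="no power tools; name each tool and why it is the right one",
    question="How do I shape and finish a small toy horse?",
)
print(prompt)
```

The point is not the code but the habit: stating role, task and constraints explicitly - rather than typing three keywords - is how one levels the playing field with the machine.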
That’s a lot of work but why not? The alternative is having the AI noose tied around our necks because we won’t do what it takes to control it before it controls us.
That is how the society thing can productively serve the technology thing and be served by it.
Epilogue
If you are crazy like me and want to have a go, there are many sources online already. A good place to start is by understanding what you are in for if you want to go rogue. PC Magazine has a piece that is good for that.
Alternatively, hold your nose and play with ChatGPT as is, or the Microsoft version on Bing, until the software freedom fighters of the world liberate us with open-source trained models.