This—100%. The AI revolution with LLMs is creating a new type of interface with ...

MatthiasPortzel · on July 3, 2023

This is a perfect example of an AI-hype comment.

You’re treating it as a fact that LLM are going to replace existing products, at some unknown future date.

“In 5 years, all code will be written by AI”

“In 5 years, LLM will replace Siri and Alexa”

“In 5 years, AI will replace [sector of jobs]”

The thing that frustrates me about these statements is that you don’t know what AI technology is going to look like in 5 years, so stop treating it like a fact. It’s possible LLM are useful in all of these places, but we don’t know that yet.

blitz_skull · on July 7, 2023

I do know, for a fact, that having a more capable and powerful voice assistant than, the already fairly capable, Siri, will be a game-changer (for me at a bare minimum, but I’m not that special so I think it’s safe to extrapolate that to more people).

That’s a fact.

I also know that voice-interfaces to date have been incredibly stiff and there is ample room for improvement. I know, for a fact, that having AI enable better voice interfaces will make computing better and more accessible. I have a hard time understanding how those are hype-driven comments and/or opinions.

We do know these things for a fact. Not being able to articulate exactly which breakthroughs will be most important doesn’t make it hype.

kazinator · on July 6, 2023

LLM is obviously useful for something like Siri, Alexa or Google Assistant, or so you would think.

There doesn't seem to be a rush because it makes the implementation a lot more expensive, and those things are, I suspect, not profitable products (revenue sources) to their respective companies. They are a kind of enhancement to a layer of products and services; people take them for granted now and so you can't take them away.

A smarter Google Assistant would do nothing for Google's bottom line, and in fact it would cost more money to operate.

If it's not done right, it could ruin the experience. For instance, it cannot have worse latency on common queries than the old assistant.

plutonorm · on July 3, 2023

GPT4 just wrote a python script for me that downloaded a star catalogue, created a fish eye camera model, and then calculated the position of the camera relative to the stars by back propagating the camera position and camera parameters to match the star positions.

All I did was hold it's hand, it wrote every line of code. You are living in fantasy land if you think we will be writing lines of code in 10 years.

alpaca128 · on July 3, 2023

> You are living in fantasy land if you think we will be writing lines of code in 10 years.

I was with you until that sentence. No, LLMs will not write all our code and the reason is very simple: coding is easier than reviewing code. Not to mention the additional complexities and weirdness that we've always dealt with without even thinking about it.

We can see in Photoshop what's coming for developers: context-sensitive AI autocompletion and gap filling. Copilot but more mature and integrated, perhaps with additional checks that prevent some bugs being inserted. And troubleshooting, the area where I think we can profit the most.

Tainnor · on July 3, 2023

that's all stuff that would be impressive for a single human to be able to produce instantly (because nobody remembers all these APIs), but that's still formulaic enough that it's not hard to imagine why ChatGPT succeeds at it

but will ChatGPT help you debug and fix a production issue that came about due to a Kafka misconfiguration? will it be able to find the deadlock in your code that is causing requests to be dropped? will it suggest a path forward when you need to replace an obscure library that hasn't been updated in 5 years? will it be able to make sense of seemingly contradictory business requirements?

Slartie · on July 4, 2023

That's not exactly the complexity of typical software that must solve an actual, difficult, business problem.

Wake me up when ChatGPT is able to write and maintain a POS system, or an online store with attached fulfillment management. Anything that goes beyond a fancy 100-line script. Anything that people actually hire teams of senior devs, business analysts and software architects for.

mattkrause · on July 3, 2023

Do you know if it works?

CyberDildonics · on July 3, 2023

Lets see it and lets see the prompts you used.

rvz · on July 3, 2023

Exactly. The AI bros here are doing the same thing as the crypto bros and almost all of them don't even know it.

Pontificating their nonsense around the hype about LLMs to the point where they don't even trust it. The same thing they did with ConvNets and they still don't trust that either since they both hallucinate, frequently.

I can guarantee you that people will not trust an AI to fly a plane without any human pilots on board end-to-end (auto-pilot does not count) and it is simply due to the fundamental black-box nature of these so-called 'AI' models being untrustworthy in high risk situations.

aenis · on July 3, 2023

I'd like to point out that humans, too, are not trustworthy in high risk situations. For this we have procedures, deterministic automation and so on.

I like to think of capable LLMs as pf gifted interns. I can expect decent results if I explain well enough, but I need processes around them to make sure they are doing what they are told. In my industry thats enough to produce a noticeable productivity gain, and likely some reduction of employment as its a low margin cut throat business relying on low grade knowledge workers. I see the hype and honestly cant stand it, but its measureably impacting my industry and the world around me.

rvz · on July 3, 2023

> I'd like to point out that humans, too, are not trustworthy in high risk situations. For this we have procedures, deterministic automation and so on.

Except humans can transparently explain themselves and someone can be held to account when something goes wrong. Humans have the ability to have differing opinions and approaches to solve unseen problems.

An AI however cannot explain itself transparently and just resorts to regurgitating whatever output it has been trained on and black-box AI models have no clear method of any transparent reasoning meaning that it cannot be held to account.

Any unseen problem it encounters, it falls back to fixed guardrails and just repeats a variation or re-wording on what it already has said. Especially LLMs.

alpaca128 · on July 3, 2023

> Except humans can transparently explain themselves and someone can be held to account when something goes wrong

Except humans are excellent at finding excuses to avoid explaining themselves and being held to account, or to justify some misguided belief based on whatever output they have been "trained on" in their past.

People often seem to apply standards to AI in terms of rationality and reliability which even many humans cannot achieve, using terms like "hallucination" when we've seen humans do the exact same by confidently talking about things they know nothing about. Everyone laughed at Bing insisting on a wrong date to avoid admitting it's wrong about the Avatar 2 release, when that's very typical behaviour of humans in certain situations.

I'm not trying to make LLMs seem better than they are, but parts of its weaknesses are not surprising given the training data.

valval · on July 3, 2023

What would you prefer to talk about? We don’t have to make predictions and discuss their potential, or at least you don’t have to join those discussions.

MatthiasPortzel · on July 3, 2023

A lot of these comments aren’t predictions. They’re assuming that openAI will create AGI in the next 5 years and they want to discuss the implications of that.

Personally, I think LLMs are a step forward, but I suspect that GTP-4 is close to the limit of what’s possible with LLMs. I don’t think we’re going to see AGI from the same approach.

healthyusa · on July 3, 2023

GPT-4 writes 100% of my code now. Staring at a monitor, hunched over, tapping on a keyboard?

Stone ages. That’s not 5 years from now. That’s today.

satisfice · on July 3, 2023

You are either full of shit, or your "coding" is pretty basic, or your code is full of bugs and you don't care.

I can't trust GPT, and neither can you. But if it really can do all your coding for you, what stops your employer from replacing you with a secretary from a temp agency?

It's so stupid for engineers to say that ChatGPT codes for them. They are shooting themselves in the face. They are devaluing the entire profession. Why? My reaction to all those breathless online demos was to point out the difference between what they were showing and what an engineer really does. Your reaction is to act like being a prompt jockey is the new way of engineering. How does that give you pride in yourself?

tiborsaas · on July 3, 2023

Do you work much with legacy systems, internal libraries and work with a large team?

I do and ChatGPT code is rarely useful for me. I can prompt it well enough to do language related stuff for me, but the code it can write for me is more like a highly custom boilerplate that I still need to refactor.

Even for green field private projects, at first it looks fine, bit the bugs are more likely to be traced back to these snippets than not.

retendo · on July 3, 2023

Can you elaborate what your process is? Some context would be nice as well. Like, what kind of language, what kind of project? I'm genuinely interested.

st-angel · on July 3, 2023

Pretty sure they're joking

afavour · on July 3, 2023

> The era of having a personal voice assistant that is capable, adaptable, and intuitive is VERY close

Year of the voice assistant is getting close to year of Linux on desktop.

What you’re promising has been promised time and time again, received endless hype cycles then collapsed once people realised the limits of the technology. Yes, this time the tech is much more capable than what came before but I’m inclined to believe we’ll yet again find a limit that means we’re using it for some things but our lives still aren’t drastically changed.

adriand · on July 3, 2023

What you’re missing is that with LLMs the chief obstacle with voice assistants changed overnight from “how do we develop a system that can easily interact in natural language” (at the time, a very hard and possibly unsolvable problem) to “how do we expose our systems to API-driven input/output” (a solvable problem that just takes time).

Case in point, I asked Siri to change my work address. She stated that I needed to use the Contacts app to do that. This is not very helpful. The issue here is not Siri’s inability to understand what I want, it is that the Contacts app does not support this method of data input. Siri is also probably not very good at extracting structured address information from me via natural language, but the new LLMs can do this easily.

afavour · on July 3, 2023

> The issue here is not Siri’s inability to understand what I want, it is that the Contacts app does not support this method of data input

…which is something an LLM won’t help with.

“Just design an open ended API capable of doing absolutely anything someone might ask ChatGPT to do” is not the simple task you’re making it out to be!

There's a reason why people describe ChatGPT as a "research tool": you often need to do a bunch of iterations to get it to do the correct thing. And that's fine because it's non-destructive. But it's very far from a world where you can let it loose on a production, writable database and trust that it's going to do the correct thing.

ben_w · on July 3, 2023

I'm sure I've seen a headlock that someone connected their screen reader to GPT and it totally could do that kind of thing…

No idea how well, so I assume "badly"; but the API is already there.

ben_w · on July 3, 2023

(headline, not headlock; and now too late to edit)

layer8 · on July 3, 2023

50% of the time Siri’s inability to understand what I want is the issue, and I don’t even try that much, given the bad experience.

andsoitis · on July 3, 2023

> The era of having a personal voice assistant that is capable, adaptable, and intuitive is VERY close and that is something that's exciting.

Intuitive to use? Or has intuition?

janoc · on July 3, 2023

And that actually anyone wants?

Google and Amazon have tried to sell theirs for a long time. And none were actually selling much. Amazon admitted to be selling theirs at a loss. Facebook has tried their own - and quickly cancelled them. Google's is in every Android device - and yet pretty much nobody uses them. Even Apple's Siri is more annoyance than help.

That something can be built doesn't mean it will sell or that people will actually want to use it. If you create a solution looking for an imaginary problem that your marketing thinks is what people want instead of a solution that solves a real existing problem, you do get a solution looking for a problem ...

Also, answering questions and communicating in natural language is the easy part of such assistant. For the thing to be useful it must be able to actually do something too. Which is incredibly difficult beyond the (closed) ecosystem of its vendor. Thirdparty integrations are usually driven by who pays the manufacturer for the SDK and partner contract (seen as a marketing opportunity), not by what the users actually want it to integrate with. Hoping for one of these with an open API that anyone could integrate whatever they want with, I am not holding my breath here.

pksebben · on July 3, 2023

> Hoping for one of these with an open API that anyone could integrate whatever they want with, I am not holding my breath here.

OpenAI is already on it. The latest gen of GPT-3 and -4 are finetuned to respond to "do this thing" commands with JSON structured to:

- provide the name of a given function call

- provide arguments to that function call

it's "early stage", which in this case probably means "good enough to be useful within a month or two", given the rate at which these things have been developing.

Anecdotally, I've been playing with giving the models instructions like:

"When asked to perform a task that you need a tool to accomplish, you will call the tool according to its documentation by this format:

TOOL_NAME(*args)

Below you will find the documentation for your tools."

...and I've gotten it working pretty damn well (not even with the JSON-finetuned models, mind you). All you really need is python-style docstrings and a minimal parser and you're off to the races. I recommend anyone interested play with it a bit.

aarong11 · on July 5, 2023

Just before the point they built this, I was already chaining queries together to do this. I built a plugin system with bits of JS code that are eval'd and arguments injected.

They couldn't have released this at a better time, I have about 30 plugins and i'd say it manages to get the right one about 90% of the time as opposed to about 70 with my hacked together version (but I guess I wrote it and know what to say so maybe that's a bit skewed)

pksebben · on July 6, 2023

I've found that GPT really like "google style" python documentation. You need to have a chunk of system prompt explain that it should be 'using the tools according to their documentation etc etc', but once you've dialed that in a little stuff like this works a charm (the docstring is what the LLM sees):

@Tool

def add_todo(title, project=None) -> str:

    """
    Add a new TODO.

    Args:
        title (str): A brief description of the task
        project (str, optional): A project to add the todo to, if requested
    """

    logger.debug(f"Adding task: {title}")
    task = Task(tw, description=title, project=project)
    task.save()
    return f"Added [ {title} ] to the { project + ' ' if project else '' }TODO list."

janoc · on July 3, 2023

And everyone will want to funnel their data and pay OpenAI/Microsoft in order to be allowed to implement a basically slightly better Alexa?

Dream on.

This is not a technical problem, this is a business problem. Sadly a lot of engineers don't understand that.

pksebben · on July 6, 2023

Oh, I think you've misunderstood me. Business problems are someone else's gig - I have no intention in making this a product or making money off it. It's for me.

The thing is, I've managed to get this working as an interface for a whole segment of stuff that was a pain in the ass before. My task list is all in one place for the first time, and it talks! With words! I have a pair programmer, who is excited to do stuff, on the command line, 24/7. They also have encyclopedic knowledge of anything that isn't a super deep cut, so I can move through more spaces and find solutions that I never would have dreamed of due to the cognitive load of sifting through textbooks and documentation just to create a [ insert more or less anything here].

If you're looking at the folks here who are getting excited and wondering "What's up with *them*?, this is it. It's not about the Next Big Thing so much as it's about "Holy shit, computers are magic again". For themselves.

Of course, I can speak for some of us. For sure, the hungry lets-make-a-startup folks exist and are currently working on doing that - and that's fine. But to me that's boring. Commerce and markets and economies are toxic to creativity. I've tried Bing-with-GPT and it's AWFUL compared to GPT-4, despite being sorta the same underlying thing.

I'm perfectly happy paying OpenAI to use the thing they built, for myself, for now. I am seriously looking forward to migrating to locally run models, once we get there (and we will).

amanaplanacanal · on July 3, 2023

Early stage might mean good enough to use in a month or two or it might mean “full self driving this year”. There isn’t any way to tell until it happens.

hunter2_ · on July 3, 2023

There might be sufficient overlap between all such concepts that a distinction hardly matters anyway: if the assistant says what's most likely to come next according to a LLM, or if a person says what they think should come next based on intuition, the listener would probably find each to be about equally intuitive to converse with due in large part to each of those qualities.

andsoitis · on July 3, 2023

> There might be sufficient overlap between all such concepts that a distinction hardly matters

“Intuitive to use” roughly means that it is easy for a human to interact with.

“Intuition” is the ability to understand something immediately, without the need for conscious reasoning.

JohnFen · on July 3, 2023

> Intuitive to use? Or has intuition?

I don't really see either of those things as a real possibility. Within my lifetime, anyway.

rvz · on July 3, 2023

> Crypto still hasn't proven itself to be useful in any way shape or form that isn't immediately over-shadowed by a different medium.

Seems like it has proven very useful for Stripe [0], Moneygram [1], TicketMaster [2], etc.

Unlike AI which continues to consume tons of resources to burn the entire world down to the ground without any viable efficient methods of training, inference or fine-tuning their AI models in the past decade with its chatbot hype and gimmickry [3], crypto does not need to consume tons of CO2 to operate, thanks to alternative and greener consensus algorithms available in production today. [4]

Being 'useful' is not an excuse to destroy the planet around untrustworthy AI models getting themselves confused over a single pixel or hallucinating in the middle of the road.

[0] https://stripe.com/gb/use-cases/crypto

[1] https://stellar.org/moneygram

[2] https://business.ticketmaster.com/business-solutions/nft-tok...

[3] https://gizmodo.com/chatgpt-ai-water-185000-gallons-training...

[4] https://consensys.net/blog/press-release/ethereum-blockchain...