I recorded a screen capture of a task. Gemini generated code to replicate it (twitter.com/dynamicwebpaige)
112 points by Michelangelo11 on Feb 22, 2024 | 93 comments


I feel like most engineers on Hacker News/Reddit have been underestimating AI tech. People always say it sucks, that it starts hallucinating nonexistent libraries and whatnot, but we are still in the early phase of this technology. And they insist it could never replace them, since AI will just make current engineers more efficient.

I've been scared of AI since seeing ChatGPT a couple of years ago. I feel like it's only a matter of time until a dev can feed an AI machine its entire code base and business requirements. And then a separate AI could carry out the manual/integration testing tasks. AI could potentially cut down the number of devs required to maintain a web app or iOS app after it's built.

I feel triggered by this post, especially because I've made a career of writing automation code, haha.


HN also underestimated the cryptocurrency craze. Yet years later, it has gone bust.

Seeing the same patterns with AI. Every startup is now incorporating “AI” or “deep learning” or “OpenAI” into their decks/motto/pitch.

Have yet to see anything worth using beyond the initial hype. Using AI to me is like learning another programming language. Same shit. Different interface.


A lot of the AI programming things you can do would be easier to:

1. type out yourself
2. copy and paste from StackOverflow
3. find a library that does the thing you want

It's not like CoPilot is any better. It's like when Microsoft and Google tried to force text completion on emails, it just gets in the way and makes me lose my train of thought.

AI is really great for very specific tasks that would be difficult to incorporate into a traditional algorithm. I really like Photoshop's background removal tool for example. But general purpose AI to me is blown out of proportion in terms of hype. Not everything needs iPhone levels of scaling. 3D printing, VR, Web 3, Cryptocurrencies, NFTs, Metaverse, AI. The list goes on. These things have niche use cases. AI is great for a lot of niche use cases (video upscaling, for example). But for general purpose software interaction? Maybe not.


> It's like when Microsoft and Google tried to force text completion on emails, it just gets in the way and makes me lose my train of thought

I have been using a paper notebook to take my notes for a while, and I like that I can remember what I scribble spatially.

Recently I decided to also use the notebook to sketch the really important e-mails on paper - the ones you send to people either really high up or whom you value a lot but can't reach often. I have been able to scribble rather quickly on paper and come up with concise but complete write-ups, and I've also noticed I am happy to not be looking at the computer screen.

I started this because I noticed there was a lot of noise going on when using Outlook, with all the notifications popping up and the hard-to-understand new interface that just scales like ass and becomes unreadable on my 4K laptop, and the autocomplete kept axing my thoughts.


Is cryptocurrency busted? Sure, it didn't replace the global monetary system, but it seems to be a proven technology now. And sure, it had its crashes, but BTC is up 112% over 1Y, and lots of cryptos have legitimate payment purposes.


> it seems to be a proven technology now.

Proven to do what exactly? A solution for which problems?

Cryptocurrency had billions invested into it and the most valuable product that came out of it was cartoon apes. There's not a single blockchain that replaced the existing financial system, an area it was supposed to disrupt. There's not a single blockchain in use anywhere in logistics and supply chain, an area the blockchain was allegedly going to revolutionize. There's not a single blockchain being used for identity management by serious enterprises.

Most new technologies are overhyped, but blockchain/cryptocurrency is the only one where you can look back and be astonished at how virtually nothing was created. Its most important lasting contribution to society is providing incontrovertible evidence that just because "thought leaders" and deep-pocketed investors say something is going to change the world, it doesn't make them right.


Whether or not you agree with the greater philosophical goal of Bitcoin, Bitcoin proved the ability to create a system of value using basically just cryptography and with a built-in reward system to incentivize the decentralization of the network -- and it grew enough that it can now operate as a standalone method of payment. It's an amazing feat.

Snowden is correct here: https://twitter.com/Snowden/status/1759304612664779247


It had done all that by 2010, no? The real hype was about what would come after.

I don't think the thesis of most Bitcoin critics is that the math and social engineering involved aren't interesting.


I don't disagree with you. But the original post was about 1) cryptocurrency as a whole being busted and 2) that cryptocurrency had never provided any kind of value. Those are what I disagree with.


Yeah it just requires massive investment in computer components, ridiculous amounts of carbon burning, servers in the Arctic to lower cooling costs, and a few towns full of people subject to more sound than jet engines at all hours of the day. Innovation!


> Bitcoin proved the ability to create a system of value using basically just cryptography and with a built-in reward system to incentivize the decentralization of the network

AKA micro transactions in games.


No, because microtransactions in games are a centralized system involving zero cryptography, usually powered by a centralized database run by the game developer. Bitcoin is decentralized.


> Bitcoin is decentralized.

Barely, if at all. Technically it can be decentralized, but Bitcoin's design promotes centralization, and the result is that currently Bitcoin can be controlled by either two entities (AntPool and Foundry USA) or three (one of the former ones plus F2Pool and ViaBTC) - and it's not going to get better.


... A solution for which problem...


Good luck sending 10 BTC worth of USD from a US bank to China or Russia, or ANY bank without raising eyebrows and getting your bank account locked/frozen


You don’t send bitcoin to a country! Maybe the act of buying the bitcoin gets your account frozen but certainly not the sending.


This is pretty much trivial.


Well, I hold crypto, but admittedly it's kind of failed as far as payments are concerned.

I used to use BTC quite regularly for payments back in like 2014-2015. But now, it's too expensive to move around due to fees so I just hold it. The same can be said even of ETH - the fees are too high to interact with a lot of the DeFi stuff.

There are 'layer 2' solutions that tackle the fees issue but uptake of these has been slow.

It's why they pivoted to calling BTC a store of value years ago. The payments side of crypto is kind of disappointing.


Maybe a better example would be the more general "blockchain". There was all sorts of talk just a few years ago about how it was the most revolutionary technology since the Web and every business was going to be affected by it.


> but BTC is up 112% over 1Y

What's the risk-adjusted return and correlation with other securities? Those are both more important than share price.

Since it doesn't pay dividends in USD, its USD value is only achieved if you go around convincing everyone else to stop selling it so that you can sell it. Which I think conflicts with it being used for payments.


As a consumer, there are practically speaking zero legitimate payment purposes available to me. Sure, there are a couple of niche services that offer BTC payments alongside credit cards... but my grocery store does not accept BTC.


Just like cryptocurrencies, transformer models have their valid use-cases, but are being so massively overhyped and oversold that it prompts natural resistance and spoils the whole thing. The result is that whenever someone talks about "AI" (or "crypto"), it's just safest to assume that they have no idea what they're talking about and want to sell it to you because more people getting hyped is how they get their investment back.


Our grandfathers underestimated the steel manufacturing craze and a century later, it is busted.


It's actually worse than that.

Why do we need web apps or iOS apps? They're just task specific computer interfaces.

It's possible that this kind of AI tech eliminates the need for task specific computer interfaces at all.

You don't need to tell the LLM to code up a TODO list app so you can sell it in the App Store.

The user doesn't even need to tell an LLM to make them a TODO list app.

Given an LLM that can persist and restore context, the user can just use the LLM as a personal assistant that keeps track of their TODO list.

Whatever the software is that we're working alongside our AI colleagues to build in ten years' time, I don't think it's going to be automated tests for apps and websites.


I’m somewhat skeptical, at least for LLMs per se. They are language models. They have, in many respects, superhuman abilities, but I wouldn’t really want to trust things that require accuracy to an unaided human or even an unaided superhuman.

I would expect better results from LLMs using programming languages, perhaps ones tailored to LLMs, to prepare tasks on behalf of their users.

(Also, LLMs doing anything direct are an incredibly inefficient use of computing resources. There are quite a few orders of magnitude of difference between the FLOPs needed to do basic calculations and the FLOPs needed to run inference on a large model that may be able to do those calculations if well trained.)


The thing is, a lot of tasks don't require accuracy.

We force accuracy on them when we computerize them, because computers historically haven't handled ambiguity well. That demands the skill of 'programming' - interpreting fuzzy real world problems, making them precise enough to be modeled in a way that classical computing can handle, and making computer routines to help.

But the underlying problem humans are looking for a bicycle-for-the-mind to make easier often didn't start off 'precise' at all.


Task-specific interfaces are for humans, though: when the LLM goes wrong, you can intervene. But I agree, they will become less critical moving forward.


This stage is called denial. There's also anger, bargaining, depression, and finally, in the end, acceptance. I'm saying it in a funny way, but I suspect it's true.

Who can say how AI will develop, but beware of happily-ever-after stories. It could be a nightmare for all of civilization, or just for engineers, or it could just not develop much further.


The problem with this is that it could be a universal argument against any skepticism. So unless skeptics are always wrong, it doesn't really work.


Absolutely! The "stages of grief" are just the stages of approaching a problem. Grief is just special because it is an unsolvable problem and so you go through all of the stages. Entering one or more stages of this process does not mean that you are going to go through all of them. In most non-grief cases it stops earlier.


That's a very interesting approach!

Ideally, skepticism has none of that: denial is arguably not part of skepticism, because it lacks skepticism of yourself (i.e., denial includes certainty). Skepticism has nothing to do with anger or bargaining (you can't negotiate truth); and depression and emotional acceptance are also out of place.

Skepticism is unemotional in that its definition doesn't reach into emotion. But skeptics - all humans, as far as we know - are emotional creatures, and beyond a doubt those emotions play a role in driving much skepticism.

Which returns me to my point: many responses to AI are driven more by those emotions than by skepticism.


Nah, it will bifurcate the industry between people who have a mental model of how things work and people who only know how to press the code-dispenser button. Offshore teams will be able to sling 10x the slop 10x faster.


And those who know what they're doing will have reliable work fixing the slop they made, but for less pay. (I say this because, judging by pay, building a new website, no matter how poorly, is more valuable than doing something like maintaining the banking systems.)


This is true.

The gap between seniors and juniors just gets bigger.


>Offshore teams will be able to sling 10x the slop 10x faster

Or it will reduce the demand for inexperienced juniors and offshore teams since you can replace their slop with AI slop.


Well, it's a good thing seniors are hired from the void fully trained. /s


You laugh, but where I live the market for juniors (other than summer internships exclusively for currently enrolled university students) is basically zero right now.

Every company is looking exclusively for seniors or at least mid level. They don't care that someone needs to train juniors in order to become seniors as long as it's someone else who has to do it. Companies can stay irrational longer than you can stay solvent.

So everyone keeps telling me how hot the SW dev market is right now due to all the openings and the high demand, meanwhile I'm only getting rejections because I don't have 3+ YoE in AWS, Kubernetes, Django/Flask, System Design, etc.


Who is saying the market is hot for the employees right now? It's dead as a doornail.


This is a problem that a lot of fields have had (e.g. hand-crafted furniture). Technological advances killed the market for juniors but demand for seniors remained. In the short term it's completely fine, but then a few decades later there's not enough new seniors to match demand.


What is the correct quantity of "senior" furniture craftsmen? How do we know there is a shortage? Unless there has been enormous wage inflation in the past few years then I am skeptical of any shortage claims.


And this is the worst it will ever be at this task; it's impressive in a scary way.

It's extra funny when you think that resilience was the first thing Sam Altman answered when asked what kids should be learning today: https://youtube.com/shorts/OK0YhF3NMpQ


We're building an AI Agent that can perform manual testing on feature branches [1]. I can tell you, it works, and it's going to get better, and it's going to happen fast. It's not hard at all for an AI to read text on the screen and click it.
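
As a sketch of how little logic the "read text and click it" step needs, assume OCR output in (text, x, y, width, height) form (the shape tools like pytesseract can produce); the targeting core is then a few lines. The box data below is hypothetical:

```python
def locate_click_target(ocr_boxes, target):
    """Return the pixel center of the first OCR box whose text matches
    the target label (case-insensitive), or None if nothing matches.
    ocr_boxes: iterable of (text, x, y, width, height) tuples."""
    want = target.strip().lower()
    for text, x, y, w, h in ocr_boxes:
        if text.strip().lower() == want:
            return (x + w // 2, y + h // 2)
    return None

# In a real agent the coordinates would be handed to an input library
# (e.g. pyautogui.click); that side effect is omitted so the core stays testable.
boxes = [("Cancel", 20, 300, 60, 24), ("Submit", 120, 300, 80, 24)]
print(locate_click_target(boxes, "submit"))  # (160, 312)
```

The hard parts in practice are OCR quality and deciding *which* element to act on, not the clicking itself.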

What's amazing is the social impact this has - often people don't believe it's real. It feels like when I had to explain to my parents that, in my online multiplayer game, the other characters were other kids at home on their own computers.

I think it's a matter of denial. Yes, software is made for humans and we will always need to validate that humans can use that software. But should a human really be required to manually test every PR in 10k person teams?

Again, as a founder of an AI Agent for E2E testing, we work with this every day. If I was a QA professional right now, I would watch the space closely in the next 6 months. The other option is to specialize in the emotional human part like in gaming. You can't test for "fun."

1. https://testdriver.ai. Demo: https://www.youtube.com/watch?v=HZQxgQ1jt4g


> You can't test for "fun."

Sounds intuitive, but there are game researchers working on exactly that. Two related terms (learned at the IEEE Conference on Games) come to mind:

1. Game refinement theory. The inventors of this theory see games as if they were evolving species, so it describes how a game becomes more interesting, more challenging, more "refined". Personally I don't buy the theory, because the series of papers had only a limited number of examples and it is questionable how the related statistics were generated (especially the repeatedly occurring baselines, Go and Mahjong), but nonetheless there is theory on that.

2. Deep Player Behavior Modeling (DPBM): This is the more interesting one. Game developers want their games to be automatically testable, but the agents are often not ready or not true enough to human play. Take AlphaZero for Go or AlphaStar for StarCraft II: they are impressive but superhuman, so the agent's behavior gives us little insight into the quality of the game or how to further improve it. With DPBM, the signature of real human play can be captured and reproduced by agents, making auto-play testing possible. Balance, fairness, engagement, etc. can then be used as indirect proxies to reassemble "fun."


Your site says 'No more writing automated tests or waiting on manual testing.'

but this solution appears to do only E2E testing, ignoring API and unit testing. Additionally, automated tests are mostly used for regression testing, not exploratory testing of new features, where most bugs will be found.


Well, you say we underestimate, but are you sure you are not the "underestimator"?

Amongst other things, one of the tasks I tried to put ChatGPT to was writing scripts in a not so popular dialect of a very popular language: PowerCLI (PowerShell). Gemini is even worse!

The issue is of course the relative scarcity of PowerCLI material versus the huge body of generic PowerShell stuff. Hallucinations include invented function parameters and much worse. It doesn't help that PowerCLI and MS's Hyper-V effort (whatever that is) both have a Get-VM cmdlet, etc.

These things are "only" next token/word guessers. They are not magic and they are certainly not intelligent. I do get great results in other domains and with a bit of creativity but you have to be really careful.

No need to feel triggered. Use these tools as best works for you and crack on but do be careful to be an engineer and critically examine the output from the tool.
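
One cheap guard when examining generated PowerShell is to diff the Verb-Noun command names against a known list before running anything. The allowlist and the generated snippet below are hypothetical illustrations (a real list could be dumped once with PowerShell's `Get-Command` and cached); note this catches invented command names but not invented parameters:

```python
import re

# Hypothetical allowlist for illustration; a real one would come from
# enumerating the installed modules with Get-Command.
KNOWN_CMDLETS = {"Connect-VIServer", "Get-VM", "Start-VM", "Stop-VM", "Get-VMHost"}

# PowerShell commands follow a Verb-Noun shape, e.g. Get-VM.
CMDLET_RE = re.compile(r"\b[A-Z][a-z]+-[A-Z][A-Za-z]*\b")

def unknown_cmdlets(script, known=KNOWN_CMDLETS):
    """Return Verb-Noun tokens in a generated script that are not in the
    allowlist -- a first-pass check for hallucinated command names."""
    return sorted(set(CMDLET_RE.findall(script)) - set(known))

# Hypothetical model output containing one invented cmdlet:
generated = "Connect-VIServer -Server vc01\nGet-VM | Set-VMResourceCap -Cpu 2"
print(unknown_cmdlets(generated))  # ['Set-VMResourceCap']
```

It's crude, but it flags the most obvious class of hallucination before anything touches a live vCenter.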


> These things are "only" next token/word guessers.

This is precisely the kind of vacuous "this is technology, I know technology, this is simple" hubristic underestimation that's being called out.

There is no upper bound to the intelligence of a "next token/word guesser". You can end up incorporating an entire world model to your predictions to improve their accuracy, and arguably this has already happened, to a currently-unreliable and basic level. It is possible that no technological advances are required to reach better than human intelligence from this point -- only more compute, bigger models and datasets, and (therefore) better next-token predictions.
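
To make the term concrete, here is the mechanism in its most trivial possible form, a bigram counter over a toy corpus. GPT-class models perform the same "guess the next token" task, just conditioned on vastly more context with vastly more parameters, which is exactly why the label says so little about capability:

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the corpus."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def guess_next(counts, token):
    """Return the most frequent follower of `token`, or None if unseen."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat ran".split()
model = train_bigram(corpus)
print(guess_next(model, "the"))  # 'cat' (follows 'the' twice, 'mat' once)
```

Everything between this toy and a frontier model is "just" a better conditional distribution over next tokens.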


"This is precisely the kind of vacuous ..."

So I am devoid of anything? Nice. I slapped "only" within quotes to imply that there is more going on and a lot more complexity than implied by a naked reading of my comment. I'm sorry you missed that.

There is no notion of a bound, or even of intelligence, for an LLM. It is a tool and no more; we know how they work, that is defined, and we run our own. We can marvel at what looks like intelligence in the outputs, but it isn't that. They can be embiggened ad nauseam, but I very much doubt we'll get intelligence per se.

You might disagree with my arguments but please don't describe me as vacuous.


> So I am devoid of anything?

The completion was "vacuous [..] underestimation". Something you appeared to be doing in this one phrase you wrote, not something you are. I don't know anything about you as an entire person. Please try not to take criticism of some of your written thoughts so personally. I continue to take issue with characterizing the LLM as "a word guesser" because it implies a limit to capability that I don't think actually exists.

> There is no notion of a bound or even intelligence for a LLM.

When LLMs are outscoring humans on many/most standardized tests, including for tests where the questions are novel, I also disagree that there is "no notion of intelligence for an LLM". It feels like goalpost-moving, to the extent that I now have no idea what you actually mean when you say intelligence.

> we know how they work

I think this is also a hubristic statement. The researchers working on these systems do not speak like this. They say things like "we did reinforcement learning on question-answering in English and it turns out it answers questions in French too now and we were surprised and can't explain why that happens".


If you can automate my job--and I maintain we not only aren't there yet, but still won't be even if you make it so I never have to write another line of "syntax" ever again (as that isn't now and never was the job of a programmer)--then you can clearly automate the jobs of most of the value chain above me, all the way up to the limited partners at the VC firms being paid management fees to pick portfolio companies. The issue here honestly isn't "programmers will be replaced and y'all are in denial"; it's "humanity is going to have a crisis of purpose when capitalism can't figure out what to do with an entire civilization of meatbags that suddenly become cost-ineffective for most any purpose". And I don't know what you expect anyone to do about that other than, like, stock weapons or whatever (as there simply aren't enough jobs doing skilled physical labor to absorb enough people that betting on civilization feels like it makes sense) :/


> what to do with an entire civilization of meatbags that suddenly become cost in-effective

Services! All sorts of things are going to become cost-effective that currently aren't.

Want a personal trainer? Motivational coach? Someone to sit next to you slapping your phone out of your hands every time you open social media? Personal runner doing errands? You can afford that now!

We're going to have an ever-shrinking pool of highly critical un-automatable people with an army of support folk keeping them running at peak productivity at all times.

You already see this trend in people who think of themselves as a business. They hire everyone from personal assistants to nannies. All in the name of "Well I make $200/h and there's this chore that costs only $50/h to delegate ..."


People keep forgetting that the value is not in the code itself but in the solution it realizes. Right now, you need developers to design and implement the code, and because they're humans, you need people to manage them, people to manage everyone, people to manage salaries and expenses... If you automate the job that is closest to the product, you remove the whole stack up to the founder. And history has proven that a founder is not really essential when the business is already running.


I would extend that from engineers to knowledge professionals in general. Our specialized tools, strange syntaxes, and invented worlds of code may shield us somewhat, but when you just want expertise and opinion typed out? That's firmly in the LLM ballpark. LLMs are starting to nail memorization with perfect recall, beating humans. My own memories are always fuzzy and I need emails and notes and books to help me along. An LLM won't need all that, and even if it did, it could access it all much quicker than I can.


I'm a "KM" "SME" and I can tell you that there is no doubt that every productivity gain in the last year has been $bigcorp employees feeding pseudo-anonymized business requirements and contexts into ChatGPT.

OpenAI + any half-assed data broker could easily infer the company, as I am sure they have already done.

All hail Microsoft; I am glad I chose the right AI megacorp overlord early.


Have there been any measurable productivity gains in the past year?


Glueware and CRM people have been awfully productive, as it's a race to the bottom at this point.

You either let a robot tell you how to do your job better, or someone else will.


How do you know? Can we actually see that increased productivity in any objective economic metrics?


A collection of anecdotal evidence and a nearly inexplicable increase in "trivial" software middleware tasks that had been dormant until GPT arrived.

Now that the means, motive, and opportunity are there, combined with a general uneasiness regarding employment opportunities, gains have definitely been on a sharp uptick.

Whether that's a first-order effect of people using generative models or a second-order effect of people believing they will be replaced by those who do; either way, the pressure is real, and the gains are material.

It may take a larger timespan and more samples, but I have little doubt middleware and other glueware is being rapidly "no-coded" by GPT models on the private computers of contractors.


I feel literally the opposite: the reason why I'm disappointed by the current state of AI is because I really want it to work so I can focus on the actually useful part of my job instead of doing boring stuff.

What's the useful part of my job? It's bringing values to my customers: they want to do something, and they need help. But today, unfortunately I must always tell them to reduce their ambition because they wouldn't be able to afford it or to be stuck waiting for the project to deliver even if they could fund it.

I've seen an enormous shift in developers' ability to actually ship valuable products to customers when open source became mainstream, thanks to GitHub and things like npm: no more custom half-baked libraries implementing all the features necessary for the product to work; we could just re-use an existing library. More than half of our job disappeared in those days, yet nobody regrets the era when you had to write your own code for absolutely every feature, and the number of programmers has exploded since then[1].

I wish AI assistant could be as impactful as github and npm, and I'm pretty sure they will eventually, and that day will be a great day for developers, not a bad one.

We're not going to lose our jobs, because from the perspective of the dude who holds the money, our job is to be the weirdo who talks to machines to deliver whatever his ego wants his company to be. Hence, the more you are able to deliver, the happier the money holder, and the more money you make. Our job will be threatened when the guys with the money are willing to actually make the effort of talking to the machine themselves, but I don't see that day coming anytime soon. The ambition of man is unlimited, but his will to make an effort is in scarce supply.

The only realistic risk with AI is big corporations grabbing all the benefits of the added productivity. This is a serious risk, and a very good reason not to be using OpenAI, so we don't trigger a self-reinforcing feedback loop that gives them a monopoly position where we all end up losing because we depend on them.

[1]: this has caused lots of sustainability issues for the library authors, but not for the developers using the libraries.


The machine has been getting easier to talk to over time. The machine also doesn't complain or get tired or sick.


Okay so what was that meltdown yesterday then?


> from the perspective of the dude who holds the money, our job is to be the weirdo who talks to machines to deliver whatever his ego wants his company to be. Hence, the more you are able to deliver, the happier the money holder, and the more money you make. Our job will be threatened when the guys with the money are willing to actually make the effort of talking to the machine themselves, but I don't see that day coming anytime soon.

There are plenty of managers who prefer hacking together Excel and VBA to relying on developers; those will probably be more than glad to use some AI tool to do something more advanced - of course mostly just for internal stuff, in the short term.

Anyhow, many managers have a distaste for developers; they consider them overpaid slackers who'll waste time on anything rather than producing money for the business.

They'll be more than glad to hand the job to the cheapest unqualified guy around if, with an LLM, he can make something that looks passable.


I'm frankly amazed it's been tried/tested as much as it has. Sure, it's amazing compared to where it used to be, and very facile with language, but it's really very bad a lot of the time at present, especially compared to an employee who's supposed to know something about what is being worked on. Skillful users can get benefit from it in narrow situations, but it really needs to improve a lot.


I don't see how the posted video is easier than just using the playback and recording tools that have been around for years.


What's the new "just learn to code, bro" platitude we can throw around?


"Just learn to do plumbing bro."

Seriously, the demand for skilled handymen in the cities is insane as people can't do shit anymore. As per the South Park episode, you have to treat them well if you want them to pick up the phone or return your calls.


Think of how nice the world would be if we treated everyone well regardless of how much we needed them.


Of course. Blue-collar work has been discredited over the past few decades: if you were doing it, you were seen as the one who failed to get into college. How many devs did you ever see greeting the cleaning staff?

Unfortunately, your status in society and the way you get treated come from how much money your profession makes. SW devs were also not very popular until Google and Meta started the bidding war.


If an AI can do this, an AI can also watch my screen and blank out all ads and edit out the sound. The controversy and panic that will happen when such an AI is eventually revealed will be interesting.


I read something about this in Garbage Day today:

> Case in point: the Arc Browser.

> For years, The Browser Company has been promising to save the internet. Its Arc Browser is a smart refresh of what a modern gateway to the web should look and feel like and it generated a lot of goodwill with early users. And then, earlier this month, they released their AI-powered search app, which “browses the internet for you.”

> The Browser Company’s new app lets you ask semantic questions to a chatbot, which then summarizes live internet results in a simulation of a conversation. Which is great, in theory, as long as you don’t have any concerns about whether what it’s saying is accurate, don’t care where that information is coming from or who wrote it, and don’t think through the long-term feasibility of a product like this even a little bit.

> But the base logic of something like Arc’s AI search doesn’t even really make sense. As Engadget recently asked in their excellent teardown of Arc’s AI search pivot, “Who makes money when AI reads the internet for us?” But let’s take a step even further here. Why even bother making new websites if no one’s going to see them?


AI can also watch your screen on Big Tech's behalf and market to you based on your porn watching habits. Or immediately alert interested parties of your torrent setup. Or allow a hacker access to your tax returns when they inevitably jailbreak said AI and it spews out the extremely sensitive personal information of billions of unaware people.


I mean, adblockers have existed for a while...


But that is the REAL issue.

AI is a slave in service to corporations, and those corps have a culture of three decades of depraved sociopathy, with comical levels of contempt for customers, people, humanity, and civilization.

AI is not your friend. Whatever service it provides is a Trojan horse or, at a minimum, surveillance.


Raise your hand if you know an actual position at an actual company that was made redundant by ChatGPT, DaVinci, Stable Diffusion or related tech


More and more board games are released with AI-generated art. It's still controversial, so larger companies are hesitant to join due to the risk of flak on social media, but it's definitely a real thing replacing jobs at the moment. Fryx Games, which made Terraforming Mars, is probably one of the better-known examples.

Art in general seems to be a high-risk occupation because the stakes are low. It's a similar case for stock photography. In the gaming world, Embracer recently fired a bunch of artists with the intent of having fewer people be more productive with the use of AI-generated content.



https://in.indeed.com/m/jobs?q=wayfair&l=&sc=0kf%3Afcckey%28...

Wayfair laid off a bunch of US IT staff and is trying to build an AI team in India to replace those workers (I think, based on events and new job openings).

edit:typos

adt'l: tbf, AI hasn't actually replaced jobs there yet (afaik)


Trying doesn't count.


Replaying recorded web tasks actually feels like a good fit for these current-gen multimodal agents.

Unsure if the code was provided (logged out I only see the one post), so I'm unable to speak to the code or how it runs, but judging by the poster's profile I think I can trust they know when code runs correctly. Though it does appear they're currently working at this specific agent's parent company.

Honestly, this makes me more impressed with the work the Selenium team has been doing than with the LLM that used their API.


They said it worked with modifications (the code got them 90% of the way)


AI is going to bifurcate the job market. Short of AGI, the current code-gen tools seem to be plateauing at a junior SWE level. Any simple tasks that are done by American IT and couldn't easily be off-shored to India will become AI tasks. Why even bother hiring junior engineers when those tasks will be covered?


I am really sorry to whinge but is there any way at all to avoid posting links to tweets? Visiting twitter is such a horrible soul destroying experience.

I can’t work out how to full screen the video if it’s even possible.

The resolution on the video is too poor for me to be able to read the code.

There are ads and troll comments everywhere.

It seems to want me to create an account.

It’s just horrific.

Is there really no better way to showcase some cool observation or story about Gemini doing something useful?


People post on Twitter. It's a thing. Asking to avoid posting links to Twitter is ridiculous.


Point taken, but I'm not sure that it is so ridiculous. In some cases (obviously not this case) there might be a better source for the article. Or even changing the URL to nitter or similar would help! Keeping the discussions about this sort of thing active and spreading awareness is a valid way of trying to promote long-term quality content on HN rather than preventing it from turning into a cesspool.


Maybe, but I agree wholeheartedly about how terrible Twitter is. Absolute trash heap of an app.


I agree. I am really happy with blog posts, articles, and PDFs, but every time there is a Twitter or YouTube post I just die inside.


Just press the X in the top right corner


One thing I am now wondering is why bother with the Selenium script at all. Why not have the AI model describe the same things it would do in a Selenium script in detailed natural language? You could store that in a DB or file more efficiently than storing the video, and just feed the natural-language description to a model for automation. And the major benefit is that it is much easier for humans to review and modify if needed.
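To make the idea concrete, here's a minimal sketch of what that could look like: a recorded task persisted as human-readable step lines, parsed into structured records, and dispatched to whatever backend (Selenium, an agent, etc.) at run time. The step grammar, `parse_step`, and `run_steps` are all hypothetical names I made up for illustration, not anything from the linked demo.

```python
# Hypothetical sketch: store a recorded task as human-readable steps,
# then dispatch each step to an automation backend at run time.
# The step format and helper names here are invented for illustration.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    action: str      # e.g. "open", "click", "type"
    target: str      # human-readable element description or URL
    value: str = ""  # text to type, if any


def parse_step(line: str) -> Step:
    """Parse lines like 'type "cats" into the search box' or
    'open https://example.com' into a structured Step."""
    action, _, rest = line.partition(" ")
    if action == "type":
        value, _, target = rest.partition(" into ")
        return Step("type", target, value.strip('"'))
    return Step(action, rest)


def run_steps(lines: List[str], handlers: Dict[str, Callable[[Step], None]]) -> List[Step]:
    """Dispatch each parsed step to a handler; in practice the handlers
    would wrap Selenium calls or an agent's tool-use API."""
    log = []
    for line in lines:
        step = parse_step(line)
        handlers[step.action](step)
        log.append(step)
    return log
```

The point being: the stored artifact stays readable and editable by a human, and only the handlers need to know about the browser.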


Eventually. OpenAI, Adept, etc. are working on these types of agents. But currently, name a model that can replace Selenium (i.e. engage with the browser).


Selenium scripts are already so flaky. Why would you want to add more ambiguity to them?


Has anybody tried feeding an LLM a bunch of business rules and having it generate a program that meets them?


I saw one of the Microsoft product owners give a talk on Power Apps at Atlanta Developer's Conference:

https://www.microsoft.com/en-us/power-platform/products/powe...

You can write just a few sentences, or even upload a drawing of what you want your app to look like, and it'll create it for you.

The catch is it's only for corporate apps. These can't be publicly facing. Everyone needs to sign in with M365 and have an individual license.


This is exciting yet a little scary TBH


Is the code actually working?



