What is your opinion on the current state of AI/LLMs

Chippys_mittens@lemmy.world · 1 day ago

Alsjemenou@lemy.nl · edit-2 3 minutes ago

LLMs have now had a decently long period to prove their worth, which turned out to be very limited in scope and depth, at least compared to the promises made beforehand.

For example, it was predicted that they would be able to write and inject code into themselves, generate their own training data, and need minimal or no human intervention to do so. This clearly is impossible.

As a tool for people to use natural language to interact with software, it’s proving to be quite effective.

As a tool for accurate dissemination of factual information, it isn’t reliable at all, and it can’t be made reliable: LLMs are incapable of reliability at a fundamental level. Language itself is a subjective human invention that we use to describe objective reality, and objective reality is only known through perception. An LLM doesn’t perceive anything; it’s not alive. So fundamentally LLMs can’t know whether they are actually being factual. That requires something more than language.

People who peddle AI bs don’t know, or wish to remain ignorant about, the fundamental limitations of language.

wewbull@feddit.uk · 23 minutes ago

They are a terrifying vector for disinformation - one that only the rich and powerful can create. People generally don’t understand that LLMs 1) will lie to them, and 2) can be tuned to spread any message the owner of the model wants.

AdamBomb@lemmy.world · 20 hours ago

They’re useful and getting better, but they’re improving by burning more tokens behind the scenes, and the prices they charge only cover a fraction of the cost. Right now there is no foreseeable path to profitability.

jtrek@startrek.website · 20 hours ago

It enables unskilled people to punch above their weight class, similar to giving a chainsaw to a toddler.

I’ve used them a little for coding, but the output isn’t always correct. It’s often incorrect in subtle ways, or inefficient in non-obvious ways. It gets worse as you build more.

Often it’s better overall to do it yourself if you know what you’re doing. If you stick to letting the LLM do it, you won’t learn much.

MagicShel@lemmy.zip · 23 hours ago

They are useful. My teams are seeing modest productivity gains by self-reporting, but I’m going to give it another six months to see if it shows up in actual metrics.

I’m enthusiastic about AI but I remain skeptical. I don’t mean to always be contrarian, but I’m dead in the middle, and to everyone who says they are great or terrible I tend to offer my experiences in the other direction.

They are not to be trusted to handle customers directly, but they can assist experts when they have to step out of their expertise. For example I can’t write Python, but I’ve been coding for 30 years. I can certainly write some good directions on what needs to be done and I can review code and correct it. So AI has let me write a bunch of complex Python scripts to automate minor parts of my job to let me focus on the hard parts.

For example, I can execute GDPR delete requests in a few moments, where doing it by hand with Hoppscotch or Postman probably takes me 5-10 minutes. We have multiple systems, and sometimes I have to delete multiple profiles for a given request.

It’s great at rubber ducking as long as you think critically about its proposed solutions. It’s fine at code review before sending it to an actual person for review. It flags non-issues but it also flags a few actionable fixes.

The important thing though is to never trust it when it comes to anything you don’t know about. It’s right a fair amount of the time, depending on what you ask, but it’s wrong enough that you should never, ever rely on it being right about something. The moment you put your life in its hands, it’ll kill you with nothing to say to the survivors but, “You’re right about that. Sorry, that was my mistake.” And it isn’t even sincere. Because it can’t be. Because it doesn’t think or feel anything.

statelesz@slrpnk.net · 23 hours ago

Great answer.

CodenameDarlen@lemmy.world · 1 day ago

They’re annoying to be honest.

I used Qwen 3.5 for some research a few weeks ago. At first the good thing was that every sentence was referenced with a link from the internet, so I naturally thought “well, it’s actually researching for me, so no hallucination, good”. Then I decided to look into the linked URLs and found it was hallucinating text AND linking random URLs to that text (???). Nothing the AI output was really in the linked web pages. The subject of the output and the URLs matched, but it was not extracting actual text from the pages; it was linking a random URL and hallucinating the text.
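The manual check described above (open each linked page and see whether the quoted text is really there) can be partly automated. What follows is only a rough sketch under stated assumptions: the function names `normalize` and `unsupported_claims` are made up for illustration, the page text is assumed to have been fetched separately, and it only catches verbatim quotes. Paraphrased claims still need a human to read the page.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so layout differences don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def unsupported_claims(claims, pages):
    """Return the (quote, url) pairs whose quote is not found in the page text.

    claims: list of (quote, url) pairs taken from the model's answer.
    pages:  dict mapping url -> page text you fetched yourself
            (hypothetical input; fetching is out of scope here).
    """
    missing = []
    for quote, url in claims:
        # A URL that was never fetched counts as an empty page.
        page_text = normalize(pages.get(url, ""))
        if normalize(quote) not in page_text:
            missing.append((quote, url))
    return missing
```

Anything this returns is a quote the linked page does not contain word for word, which is exactly the failure described in the comment.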

Related to code (that’s my area, I’m a programmer), I tried to use Qwen Code 3.5 to vibe code a personal project that was already initialized and basically working. But it just struggles to stay consistent. It took me many hours of prompting the LLM, and in the end it produced a messy code base that is hard to maintain. I asked it to write tests as well, and when I checked them manually they were just bizarre: they passed but didn’t cover the use cases properly, with a lot of hallucination just to make the tests pass. A programmer doing it manually could write better code and at least keep it maintainable, writing tests that cover actual use cases and edge cases.
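To make the “tests pass but cover nothing” complaint concrete, here is a small illustrative sketch. It is not taken from the project described above; `apply_discount` and both test functions are hypothetical names invented for the example.

```python
def apply_discount(price, percent):
    """Hypothetical function under test: price reduced by percent (0-100)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_vacuous():
    # Passes, but proves almost nothing: nearly any implementation
    # that returns a value without crashing satisfies it.
    result = apply_discount(100, 10)
    assert result is not None


def test_meaningful():
    # Checks a typical case, both boundaries, and the error path.
    assert apply_discount(100, 10) == 90.0
    assert apply_discount(19.99, 0) == 19.99
    assert apply_discount(50, 100) == 0.0
    try:
        apply_discount(50, 150)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for percent > 100")
```

Both tests show up as green in a test run, which is why a passing suite alone says little about whether the use cases are covered.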

Related to images, I can spot most AI-generated art from very far away. There’s something about it that I can’t put my finger on, but I somehow know it’s AI made.

In conclusion, they’re not sustainable: they make half-working things and generate more costs than income, on top of the natural resources they use.

This is very concerning in my opinion. Given humanity’s history, if we rely on half-done things it might lead us to very problematic situations. I’m just saying, the next Chernobyl disaster might have some AI work behind it.

Buckshot · 24 hours ago

Had the same research issue from multiple models. The website it linked existed and was relevant but often the specific page was hallucinated or just didn’t say what it said it did.

In the end it probably created more work than it saved.

Also a programmer, and I find it OK for small stuff, but anything beyond one function and it’s just unmaintainable slop. I tried vibe coding a project just to see what I was missing. It’s fine, it did the job, but only if I don’t look at the code. It’s insecure, inefficient, and unmaintainable.

CodenameDarlen@lemmy.world · 24 hours ago

I agree, I assumed this error was LLM-related rather than specific to Qwen. I think LLMs aren’t able to match the referenced URL with the text extracted from it. They probably do some extensive research (I remember it searched like 20-40 sites), but it’s up to the LLM whether it’ll use an exact mention of a given web page or not. So that’s the problem…

Also, it’s a complete mess for building frontends. If you ask for a single landing page or a pretty common interface it may be able to build something reasonably good, but for more complex layouts it’ll struggle a lot.

I think this happens because it’s hard to test interfaces. I never got deep into frontend testing, but I know there are ways to write actual visual tests for it. The LLM can’t easily relate the code to an image, though, so we’d need to take constant screenshots of the result, feed them back to the LLM, and ask it to fix things until the interface matches what you want. We’d need a vision-capable model more than a coding one.

I mean you may get good results for average and common layouts, but if you try anything different you’ll see a huge struggle from LLMs.

leoj@piefed.social · 24 hours ago

For context and to your knowledge of the field, is Qwen 3.5 supposed to be cutting edge?

CodenameDarlen@lemmy.world · 24 hours ago

It’s the best open source model, pretty close to Claude on benchmarks.

TootSweet@lemmy.world · 23 hours ago

Is Qwen really Open Source, or do they just let you download weights? (Like LLaMa.)

CodenameDarlen@lemmy.world · 23 hours ago

Not sure now, but it says Apache 2.0 in their GitHub repo.

WolfLink@sh.itjust.works · 24 hours ago

Qwen 3.5 is one of the best of the open-weight (self-hostable) models right now. It’s not as good as some of the extra massive proprietary models like the bigger Claude models.

leoj@piefed.social · 24 hours ago

ah ok, I have some experience hosting Ollama and of course stable diffusion, but haven’t really messed with too many others, thanks for the insight!

WolfLink@sh.itjust.works · 24 hours ago

Qwen 3.5 can be run via ollama

leoj@piefed.social · 24 hours ago

well now I have something to do this weekend if the weather is poor, thank you!

venusaur@lemmy.world · 23 hours ago

You’re gonna get a very anti-AI bias on here

TootSweet@lemmy.world · 22 hours ago

venusaur@lemmy.world · 19 hours ago

AI is reality. It’s just math and rules.

TootSweet@lemmy.world · 18 hours ago

venusaur@lemmy.world · 17 hours ago

That’s a human problem from how tokenization was designed. It’s still reality. It’s a real response from real code.

TootSweet@lemmy.world · 58 minutes ago

So it’s real except in all the ways that matter. Got it.

Chippys_mittens@lemmy.world · 21 hours ago

That’s fine, I figured I would, but I might learn something regardless.

nonentity@sh.itjust.works · 20 hours ago

LLMs exist, AI doesn’t.

Anyone who calls LLMs ‘AI’ is betraying they don’t understand what the labels are. Their opinions on the subject should be summarily dismissed, and ridiculed if they persist.

LLMs have vanishingly narrow legitimate use cases, none of which have proven justifiable to be wielded unsupervised.

BlameThePeacock@lemmy.ca · 16 hours ago

That last thing could be said for most humans, especially those in the lower salary brackets. We still employ them in droves, and supervisors to supervise them too.

sach@lemmy.world · 15 hours ago

It could be said, but it’s hopefully not the way humans see other people, life is not for working but work is for living, and a person’s value is not determined by the economic output they provide.

TootSweet@lemmy.world · 1 day ago

They’re a straight up scam.

Chippys_mittens@lemmy.world · 24 hours ago

How so?

TootSweet@lemmy.world · edit-2 23 hours ago

They just don’t do anything useful, and the hype-ers are acting like they’re AGI. Hallucinations make them too unreliable to be trusted with “real work”, which makes them useless for anything beyond a passing gimmick. Vibe coded software is invariably shit. Doing any serious task with “AI assistance” ends up either taking more work than doing it without LLMs or sacrificing quality or correctness in huge ways. Any time you point this out to hype-ers, they start talking about “as AI advances” as if it’s a foregone conclusion that it will. People talked the same way about blockchain, and the only “advancements” that have been made in that sphere are more grifts, and meanwhile it still takes anywhere between 10 minutes and an hour to buy a hamburger with Bitcoin, and it gets worse with greater adoption. Just like you can’t make a distributed blockchain cryptocurrency that is fast at scale and resolves discrepancies automatically without relying on humans (and even if you could make it fast, it’d introduce at least as many problems as it purports to “solve”), you can’t make LLMs not hallucinate. The only way to solve hallucinations is by abandoning LLMs in favor of a whole different algorithm.

If anything LLMs have blocked us from making progress toward AGI by distracting us with gimmicky bullshit and taking resources from other efforts which may otherwise have pushed us in the right direction.

Mind you, “AI” is a very old term that can mean a lot of different things. I took a class in college called “Introduction to Artificial Intelligence” in… maybe 2006 or 2007. And in that class, I learned about the A* algorithm. Every time you played an escort mission in Skyrim and had an NPC following you, it was the A* algorithm or some slight variation on it that was used to make sure that NPC could traverse terrain to keep roughly in tow with you despite obstacles of various sorts. It’s absolutely nothing like LLMs. It doesn’t need to be trained. The algorithm fully works the moment it’s implemented. If you want to know why it made a particular decision, you can trace the logic and determine exactly why it did what it did, unlike LLMs. It’s for a few very niche purposes rather than trying to be general purpose like an LLM. It requires no massive data centers and doesn’t consume massive amounts of memory. And it doesn’t hallucinate. The AI hype-ers (and the media who have mostly fallen for their grift hook, line, and sinker) love to conflate completely unrelated technologies to give the impression that LLMs are getting better because such-and-such article mentions an “AI” that discovered a groundbreaking new drug. But the kind of AI they use to find drugs is very special purpose and has nothing to do with how LLMs work.
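For readers who haven’t seen it, here is a minimal sketch of A* on a small grid. It is a textbook-style illustration, not the code of any particular game; the grid encoding (0 for free, 1 for wall), the 4-way movement, and the function name `a_star` are assumptions made for this example.

```python
import heapq


def a_star(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) / 1 (wall), or None."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance never overestimates with unit-cost 4-way moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    came_from = {}                      # cell -> previous cell on best path
    best_g = {start: 0}                 # cheapest known cost to reach a cell

    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current == goal:
            # Walk the parent links back to the start.
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if g > best_g[current]:
            continue  # stale heap entry; a cheaper route was found already
        r, c = current
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < best_g.get(nxt, float("inf")):
                    best_g[nxt] = new_g
                    came_from[nxt] = current
                    heapq.heappush(open_heap, (new_g + h(nxt), new_g, nxt))
    return None  # goal unreachable
```

Every decision is visible in `best_g` and `came_from`, which is the traceability the comment refers to: you can inspect exactly why a given cell ended up on the path.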

LLMs can’t do your job, but the grifters are doing a damned good job of convincing your boss that LLMs can in fact do your job. As Cory Doctorow says, the current AI craze “is the asbestos that we’re shoveling into our walls”. We’re causing huge problems with it and if/when the bubble properly pops, we’re going to spend a long time painstakingly extracting it from our systems, replacing it with… you know… stuff that actually works, and repairing the damage it’s done in the meantime.

Meanwhile, it’s Nvidia and OpenAI and so on who are boosting the LLM bubble. And they’ve made a shit ton of money off of their grift at the expense of everyone else. How anyone can look at all this and not think “scam” is beyond me.

LumpyPancakes@piefed.social · 22 hours ago

I have a vague memory that Bitcoin used to be instant in the first versions - or at least with near certainty that the advertised transaction was real, but that the protocol was later modified in such a way that this mechanism was no longer reliable. It might have been enshittified.

AI is still largely affected by garbage in garbage out.

leftzero@lemmy.dbzer0.com · 12 hours ago

AI is still largely affected by garbage in garbage out.

Exactly. When it comes to code, for instance, what percentage of the training data is Knuth, Carmack, and similarly skilled programmers, and what percentage is spaghetti code perpetrated by underpaid and uninterested interns?

Shitty code in the wild massively outweighs properly written code, so by definition an LLM autocomplete engine, which at best can only produce an average of its training model, will only produce shitty code. (Of course, though, average or below average programmers won’t be able — or willing — to recognise it as shitty code, so they’ll feel like it’s saving them time. And above average programmers won’t have a job anymore, so they won’t be able to do anything about it.)

And as more and more code is produced by LLMs the percentage of shitty code in the training data will only get higher, and the shittiness will only get higher, until newly trained LLMs can only produce code too shitty to even compile, and there will be no programmers left to fix it, and civilisation will collapse.

But, hey, at least the line went up for a while and Altman and Huang and their ilk will have made obscene amounts of money they didn’t need, so it’ll have been worth it, I suppose.

snoons@lemmy.ca · 1 day ago

They might be good given time, probably a lot of time, but right now all they’re doing is allowing that well meaning roommate that puts your cast iron in the dishwasher to also ruin Wikipedia articles and fuck up open source projects.

stoy@lemmy.zip · 1 day ago

I hate it.

I am an IT guy, and AI has just about killed my enthusiasm for tech, I made a post about it a month or two ago, and it is still valid.

Norin@lemmy.world · 24 hours ago

They’re digital yes men, mostly, and they really lack nuance when you prompt them on anything you have deep knowledge of.

Chippys_mittens@lemmy.world · 24 hours ago

Makes sense

CompactFlax@discuss.tchncs.de · edit-2 1 day ago

They’re pervasive in an annoying way, and the boosters are using them for utterly ridiculous things.

They have their very limited uses. For short things they can be useful, within reason. “How do you take these results and transform them into X in Python” then take a very squinty look at it and figure out where it went wrong. Then, try asking a couple follow-ups and the code just scrambles.

For writing I’ve found they’re pretty useless, because I can’t figure out how to prompt them to not sound like they’re in the marketing department and blowing smoke.

But they can be a good starting point for finding information when I’m looking for something that’s really a Reddit question, rather than something I can summarize into keywords for a search engine. Still, too often useless.

I recently had someone send me the answer to “is it cheaper to air bnb or get a hotel at $destination” and it was absurdly incorrect, as in off by a factor of two, when it would have taken mere seconds more to get correct information. I have relatives who work in professions which literally define accuracy (accounting and law) and they rely on them for stuff like that, and it’s so provably incorrect.

Scipitie@lemmy.dbzer0.com · 1 day ago

In case you wanna give it a shot: I gave writing samples of myself from chat and emails to a self-hosted LLM, telling it to extract the writing style deviations, key elements, common phrases, symbols, patterns, etc. Then I gave that as an “answer in this style” system prompt expansion. It works like … quite okay. I still need to go over it of course, but it doesn’t sound like marketing bullshit and conveys what I want.

Completely agree with your general assessment though! They’re getting better but the marketing machinery is crazy in their claims.

leoj@piefed.social · 24 hours ago

Yeah in my experience the better your input (original writing) and prompt, the better it does, although I think it really depends on what you’re looking for.

I can be a bit… Verbose…? So when there is a character or text limit I will use an LLM to shorten or condense my thoughts, which has turned out fairly well, but I bet the quality degrades quickly as the inputs degrade in quality.

harmbugler@piefed.social · 19 hours ago

Caveman plugin: why use many token when few do trick

Chippys_mittens@lemmy.world · 1 day ago

They are definitely insanely pervasive right now.

Hackworth@piefed.ca · 1 day ago

As a video producer, the AI baked into the Adobe suite is very useful (generative fill, harmonize, and neural filters in Photoshop, generative extend and AI noise reduction in Premiere, lots of older stuff in After Effects).

As far as LLMs go, I get a lot out of talking through things with Claude, or coding silly little toys that only matter to me. But I’d never trust an agent with tools or access. And Anthropic’s own research is a good place to start for why that won’t change anytime soon.

Chippys_mittens@lemmy.world · 1 day ago

Interesting, thank you!

statelesz@slrpnk.net · 23 hours ago

Today I tasked Gemini Pro with helping me code a quite simple web GUI in Python using NiceGUI. Besides somewhat doing what I asked it to do, it also added a bunch of childish emojis to buttons, removed my name from the project, and replaced it with ‘admin’. This is a real tool that I develop for a handful of my very real coworkers, and my boss is paying Google for this shit. Next time I’d much rather give the task to one of our apprentices and point them to the docs than have a supposedly ‘Pro’ model do random shit I haven’t asked it to do.