AI Training

ekZepp@lemmy.world · edit-2 9 months ago

AI Training

Monstrosity@lemm.ee · edit-2 9 months ago

The big difference is Deepseek is open sourced, which ALL of these models should be because they used our collective knowledge and culture to create them.

I like AI but the single biggest issue is how it is being gated off and abused by Capitalists for profit (It’s kind of their thing).

Dadifer@lemmy.world · 9 months ago

Open-weighted*

dependencyinjection@discuss.tchncs.de · 9 months ago

Can you elaborate on what you mean, for a layman?

Dadifer@lemmy.world · 9 months ago

The neural network is made of nodes joined by hundreds of billions of connections of different strengths, or “weights”, loosely like our neurons. Open weights means that they released the weight of every connection between the nodes, the blueprint of the neural network, if you will. It is not open source because they didn’t release the material that it was trained on.
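To make the distinction concrete, here is a minimal sketch in Python. Everything in it is made up for illustration (the `weights` dictionary, the tiny network shape, the numbers); it is not DeepSeek's code or file format. The point is only that the released weights are enough to run the network, while saying nothing about the data that produced them.

```python
import math

# Hypothetical "released weights" for a toy network with 2 inputs,
# 2 hidden nodes and 1 output node. A real open-weights release is the
# same idea with billions of numbers stored in files.
weights = {
    "hidden": [[0.5, -1.0], [1.5, 0.25]],  # one row of connection strengths per hidden node
    "output": [1.0, -2.0],                 # connection strengths into the output node
}

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs):
    """Run the network using only the released weights."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)))
        for row in weights["hidden"]
    ]
    return sigmoid(sum(w * h for w, h in zip(weights["output"], hidden)))

# Anyone holding the weights can compute a prediction...
prediction = forward([1.0, 0.0])
# ...but nothing in this file reveals which examples the weights were fitted to.
```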

dependencyinjection@discuss.tchncs.de · 9 months ago

Thanks.

Are there any models that are truly open source where they have shown the datasets it was trained on?

Dadifer@lemmy.world · 9 months ago

Not that I know of

HappyFrog@lemmy.blahaj.zone · 9 months ago

There are probably several “mnist” or other smaller networks that are fully open sourced. But that’s not the kind of neural networks most are talking about.

Lemminary@lemmy.world · 9 months ago

It is not open source because they didn’t release the material that it was trained on.

I’m not sure if I’m missing a definition here but open source usually means that anyone can use the source code under some or no conditions.

Dadifer@lemmy.world · 9 months ago

You can’t use the source code, just the neural network the source code generated.

LostWon@lemmy.ca · edit-2 9 months ago

deleted by creator

Johanno@feddit.org · 9 months ago

Open source means, by definition, that the code is open, the usage is open, and anybody can use it.

This includes in theory the training material for the model.

But in common language open source means: I can download it and it runs on my machine. Ignoring legal shit.

Spaceballstheusername@lemmy.world · 9 months ago

I’m pretty sure open source means that the source code is open to see. I’m pretty sure there are open source things that you need to pay to use.

Hawk@lemmynsfw.com · 9 months ago

An LLM is an equation, fundamentally. Map words to numbers, run them through the equation, map the result back to words, and you have an LLM. If you’re curious, write a name generator using torch with an RNN (plenty of tutorials online) and you’ll have a good idea.
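A rough sketch of that pipeline, assuming a much simpler stand-in than the torch RNN suggested above: a character-level model whose weights are fitted by counting which character follows which. The name list and every identifier are invented for the example; a real model learns its weights by gradient descent over far more data.

```python
import math
import random

# Map characters to numbers and back. "." marks the start and end of a name.
names = ["anna", "hanna", "nina"]  # made-up training data
chars = sorted(set("".join(names)) | {"."})
to_id = {c: i for i, c in enumerate(chars)}
to_char = {i: c for c, i in to_id.items()}

# The "weights": one number per (current character, next character) pair,
# fitted here by simple counting.
n = len(chars)
weights = [[0.0] * n for _ in range(n)]
for name in names:
    padded = "." + name + "."
    for a, b in zip(padded, padded[1:]):
        weights[to_id[a]][to_id[b]] += 1.0

def next_char_probs(char_id):
    """The 'equation': turn one row of weights into probabilities."""
    # log(count + small constant) keeps unseen pairs possible but unlikely
    logits = [math.log(w + 0.01) for w in weights[char_id]]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate(rng, max_len=20):
    """Number in, equation, number out, mapped back to characters."""
    out = []
    current = to_id["."]
    for _ in range(max_len):
        probs = next_char_probs(current)
        current = rng.choices(range(n), weights=probs)[0]
        if to_char[current] == ".":
            break
        out.append(to_char[current])
    return "".join(out)

name = generate(random.Random(0))
```

Swapping the counting table for a trained RNN or transformer changes how good the equation is, not the overall shape of the loop.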

The parameters of the equation are referred to as weights. They release the weights but may not have released:

source code for training
source code for inference / validation
training data
cleaning scripts
logs, git history, development notes etc.

Open source is typically more concerned with the open nature of the code base to foster community engagement and less on the price of the resulting software.

Curiously, open-weight LLM development has somewhat flipped this on its head: the resulting software is freely accessible and distributed, but the source code and training material are less accessible.

Schadrach@lemmy.sdf.org · 9 months ago

In parallel to what Hawk wrote, AI image generation is similar. The idea is that through training you essentially produce an equation (really a bunch of weighted nodes, but functionally they boil down to a complicated equation) that can recognize a thing (say dogs), and can measure the likelihood any given image contains dogs.

If you run this equation backwards, it can take any image and show you how to make it look more like dogs. Do this for other categories of things. Now when you ask for a dog lying in front of a doghouse chewing on a bone, it generates some white noise (think “snow” on an old TV) and asks the math to make it look maximally like a dog, doghouse, bone and chewing at the same time, possibly repeating a few times until the results don’t get much more dog, doghouse, bone or chewing on another pass, and that’s your generated image.
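The loop described here can be sketched in a few lines of Python. This is only an illustration of the simplified picture in this comment, not how production image generators are built (real diffusion models are trained to predict and remove noise rather than to climb a classifier score). The 4-pixel "image", `DOG_PATTERN`, and both functions are hypothetical stand-ins for a trained recognizer.

```python
import random

# Hypothetical stand-in for a trained recognizer: it scores how "dog-like"
# a tiny 4-pixel image is by its closeness to a fixed made-up pattern.
DOG_PATTERN = [0.9, 0.1, 0.8, 0.2]

def dog_score(image):
    """Higher means 'looks more like a dog' (the maximum is 0.0)."""
    return -sum((p - t) ** 2 for p, t in zip(image, DOG_PATTERN))

def score_gradient(image):
    """Direction in which each pixel should change to raise the score."""
    return [-2.0 * (p - t) for p, t in zip(image, DOG_PATTERN)]

def generate(rng, steps=50, step_size=0.1):
    # Start from white noise, like "snow" on an old TV.
    image = [rng.random() for _ in DOG_PATTERN]
    for _ in range(steps):
        grad = score_gradient(image)
        # Nudge every pixel a little toward "more dog" and repeat.
        image = [p + step_size * g for p, g in zip(image, grad)]
    return image

# Same seed, so `noise` is exactly the starting point used inside generate().
rng = random.Random(0)
noise = [rng.random() for _ in DOG_PATTERN]
result = generate(random.Random(0))
```

After enough passes the score stops improving, which matches the "repeat until it doesn't get much more dog" stopping rule above.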

The reason they have trouble with things like hands is because we have pictures of all kinds of hands at all kinds of scales in all kinds of positions and the model doesn’t have actual hands to compare to, just thousands upon thousands of pictures that say they contain hands to try to figure out what a hand even is from statistical analysis of examples.

LLMs do something similar, but with words. They have a huge number of examples of writing, many of them tagged with descriptors, and are essentially piecing together an equation for what language looks like from statistical analysis of examples. The technique used for LLMs will never be anything more than a sufficiently advanced Chinese Room, not without serious alterations. That however doesn’t mean it can’t be useful.

For example, one could hypothetically amass a bunch of anonymized medical imaging including confirmed diagnoses and a bunch of healthy imaging and train a machine learning model to identify signs of disease and put priority flags and notes about detected potential diseases on the images to help expedite treatment when needed. After it’s seen a few thousand times as many images as a real medical professional will see in their entire career it would even likely be more accurate than humans.

Monstrosity@lemm.ee · 9 months ago

Yeah. That is true.

kibiz0r@midwest.social · 9 months ago

I wouldn’t say it’s the biggest issue. Even if access was free, we’d still have to contend with the extreme energy use, and the epistemic chaos of being able to generate convincing bullshit much quicker than it can be detected and flagged.

I think it’s a harmful product in general. We’re polluting our infosphere the same way we polluted our ecosphere, and in both cases there’s still folks who think “unequal access to polluting industries” is the biggest problem.

Hackworth@lemmy.world · 9 months ago

All the data centers in the US combined use 4% of the electric load, and one of the main upsides to DeepSeek is that it requires much less energy to train (the main cost).

Monstrosity@lemm.ee · 9 months ago

You’re right about this. I was commenting in the context of “intellectual property”.

Hawk@lemmynsfw.com · 9 months ago

The energy use isn’t that extreme. A forward pass on a 7B can be achieved on a MacBook.

If it’s code and you RAG over some docs you could probably get away with a 4B tbh.
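For a sense of why models of this size fit on a laptop, here is a back-of-the-envelope calculation of how much memory the weights alone take. The bit widths are assumptions about common storage formats (16-bit, plus 8-bit and 4-bit quantization), and the estimate ignores activations, context cache, and runtime overhead, so treat it as a lower bound.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

for label, params in [("7B", 7e9), ("4B", 4e9)]:
    for bits in (16, 8, 4):
        print(f"{label} at {bits}-bit: ~{weight_memory_gb(params, bits):.1f} GB")
```

By this estimate a 7B model needs roughly 14 GB at 16-bit and about 3.5 GB at 4-bit, and a 4B model about 2 GB at 4-bit.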

ML models use more energy than a simple model, though not that much more.

The reason large companies are using so much energy is that they are using absolutely massive models to do everything so they can market a product. If individuals used the right model to solve the right problem (size, training, feeding it with context, etc.) there would be no real issue.

It’s important we don’t conflate the excellent progress we’ve made with transformers over the last decade with an unregulated market, bad company practices and limited consumer tech literacy.

TL;DR: LLM != search engine

uis@lemm.ee · 9 months ago

we’d still have to contend with the extreme energy use,

Meanwhile people running it on a Raspberry Pi: “I made it consume 1W less, which is a 30% improvement!”

and the epistemic chaos of being able to generate convincing bullshit much quicker than it can be detected and flagged.

It’s been this way long before modern AI.

InFerNo@lemmy.ml · 9 months ago

The infosphere already turned to shit over 10 years ago when the internet started consolidating towards a few super large companies.

surph_ninja@lemmy.world · 9 months ago

I love that the anti-AI crowd is struggling to update their talking points after DeepSeek has made most of them irrelevant.

kibiz0r@midwest.social · 9 months ago

You’re welcome to scroll my comment history to about 6 months ago where I was using the phrase “informational equivalent of Kessler Syndrome”, or talking about accurate attribution and faithful replication as being useful effects of the current copyright regime even if the rest of it is garbage.

merc@sh.itjust.works · 9 months ago

Wow, you really have your head deep in something. I’m not sure if it’s deep in the sand, or deep up Sam Altman’s ass.

You think people have “talking points” against bullshit fountains, as if they’re needed? You think it’s a struggle to come up with reasons why they’re bad? The truth is that they’re absolutely useless at best, and actively harmful at worst. And no, this isn’t about the monstrous amounts of energy they use. It’s that they offer nothing of value.

“Gee whiz, thanks to this bullshit fountain I can research legal cases much more efficiently!” Oops, turns out the bullshit fountain just made up those precedents and now the judge is furious at me.

“Apple Intelligence will just summarize my texts for me!” Oops, the summaries were so wrong that they were actively harmful, and now Apple has been forced to turn off that feature.

“I’ll just have the LLM generate code for me!” Oops, now I have to spend a week debugging because the perfectly plausible code that was generated has a subtle logic error.

“Let me just search the web for something. Aha, I won’t be taken in by that AI summary at the top because I know that’s unreliable bullshit from the bullshit fountain. I’ll just scroll past that and click on the actual web pages.” Oops. Those actual web pages are now LLM-generated and filled with bullshit. I guess now I have to stop using the web and rely on printed encyclopedias from the time before LLMs to actually get verifiable facts. Winning!

surph_ninja@lemmy.world · edit-2 9 months ago

No, I’m a socialist. I’m against big tech and the rich in general. But I don’t let my bias cause me to latch onto every criticism blindly. Even now, we’re talking about DeepSeek, which is the furthest thing from pro-big tech and also much more energy efficient, but you’re still deflecting back to your generic and irrelevant criticisms to try to apply them to this.

Your examples give away that you don’t have significant experience with AI. You know about Altman claims that have been debunked publicly, but obviously not much personal experience that would give you more than a superficial understanding of the basics beyond that.

merc@sh.itjust.works · 9 months ago

you don’t have significant experience with AI

I have enough experience to know it’s utterly useless. If you keep looking for more experience once you’ve realized that, you’re in a cult.

surph_ninja@lemmy.world · 9 months ago

Cool dude. Keep your head buried in the sand, and continue refusing to keep up with new technology. Surely it won’t backfire.

merc@sh.itjust.works · 9 months ago

Ha! You think AI is “keeping up with technology”? It’s not, it’s a diversion. Technology continues to advance, but this sideshow has nothing to do with it. You keep on drinking that kool-aid. Surely there will be a use for the bullshit fountain, any day now!

Sorgan71@lemmy.world · edit-2 9 months ago

Artists use our collective knowledge and culture in the same way. It’s just some of them are whiny and complain when AI does their job faster and cheaper.

Monstrosity@lemm.ee · edit-2 9 months ago

I am an artist & I agree, actually.

I do think it’s problematic that corpos are using AI to replace working artists, although that’s a systemic issue affecting a lot of disciplines.

That said, and I will get hate for this, there is a case to be made that if artists were more creative and interesting in general, they wouldn’t be so easily displaced by AI slop.

Sorgan71@lemmy.world · 9 months ago

Yeah I mean faster and cheaper does not mean more creative.

surph_ninja@lemmy.world · 9 months ago

Exactly. DeepSeek Mr Bean is sharing the answers with everyone. Not hoarding them for his own gain.