I could be wrong, but I think the use case here is mainly for non-artists in domains where the music is not particularly important.
For example, a podcaster/youtuber may want a short intro track. An entertainer or a marketer may want some generic or silly background music.
Does it have a use case for a producer/musician? Maybe. It might give them ideas for chord progressions, melodies, etc. But real music does that too, and much more effectively.
abdullahkhalids · 3h ago
OT: Has anyone tried the opposite - ask AI to listen to music and determine the notes or chords being played? Or watch someone playing an instrument and give a textual output of what notes/chords they are playing.
TuringNYC · 14m ago
I did this for my graduate capstone (https://www.deepjams.com/)
We extracted chord progressions from existing music you would upload and then riffed based on those chords. There are open source libraries for this.
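For anyone wondering what the simplest form of chord extraction looks like, here is a minimal sketch of template matching on chroma vectors (12 numbers, one per pitch class). This is a textbook baseline, not necessarily what deepjams or any particular library does. The function names and the chroma values are made up for illustration; in practice the chroma frames would come from an audio analysis library.

```python
# Pitch classes, index 0 = C. A chroma vector holds the energy of each pitch class.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def triad_template(root, minor=False):
    """Return a 12-element 0/1 template for a major or minor triad on `root`."""
    third = 3 if minor else 4
    template = [0.0] * 12
    for interval in (0, third, 7):
        template[(root + interval) % 12] = 1.0
    return template

def guess_chord(chroma):
    """Pick the triad whose template has the highest dot product with `chroma`."""
    best_name, best_score = None, float("-inf")
    for root in range(12):
        for minor in (False, True):
            template = triad_template(root, minor)
            score = sum(c * t for c, t in zip(chroma, template))
            if score > best_score:
                best_name = NOTE_NAMES[root] + ("m" if minor else "")
                best_score = score
    return best_name

# Hypothetical chroma frames (invented numbers, not from a real recording).
a_minor = [0.9, 0, 0, 0, 0.8, 0, 0, 0, 0, 1.0, 0, 0]  # energy on C, E, A
g_major = [0, 0, 0.7, 0, 0, 0, 0, 1.0, 0, 0, 0, 0.8]  # energy on D, G, B
progression = [guess_chord(a_minor), guess_chord(g_major)]
print(progression)  # ['Am', 'G']
```

Real recordings are much noisier than this, which is why practical tools add smoothing over time and more chord types than plain triads.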
magicmicah85 · 2h ago
I would love this! There's a song I like by a band that broke up in 2013 and I am transcribing it by watching a live performance they did and trying my best but realizing I'm trying to take a mandolin/guitar and put it to acoustic. Even just being able to do a similar rendition would be nice by telling the AI "hey, do a twist on this and give me the chords/tabs".
I use https://moises.ai/ multiple times a week for practicing / figuring out chords being played. For the notes (say in a guitar riff), I don't know if such a thing exists.
abdullahkhalids · 44m ago
Being able to isolate instruments, if it works well, is already a pretty big achievement.
thepryz · 2h ago
There’s a ton. Haven’t used any personally. AnthemScore, ScoreCloud, Melody Scanner are just a few I found after a quick search.
abdullahkhalids · 2h ago
I think these are all using old machine learning techniques and not the modern transformer based architectures that underlie LLMs. These tools won't be able to match the abilities of an expert musician replicating a song by listening to a live recording of it. Check this video channel where they ask professional drummers to replicate a song after only one listen [1].
Reminds me of an example in a similar direction, where AI was used for audio processing to filter out everything except a person's voice. If I remember right, it was able to focus on different people in a crowded room. It might have been also for music, to pick out an instrument and listen to it, filtering out the rest of the band.
nemo1618 · 2h ago
I'm very interested in this too. We're beginning to see models that can deeply understand a single image, or an audio recording of human speech, but I haven't seen any models that can deeply understand music. I would love to see an AI system that can iteratively explore musical ideas the same way Claude Code can iterate on code.
ethan_smith · 1h ago
There are several tools that do this already - AnthemScore, Spleeter, CREPE, and even Google's AudioLM can transcribe music to MIDI with varying accuracy depending on instrument complexity and audio quality.
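As a rough illustration of the last step in audio-to-MIDI transcription: once a pitch tracker has estimated a frequency, mapping it to a note is just the standard equal-temperament formula (A4 = 440 Hz = MIDI note 69). The sketch below assumes the frequencies are already detected; the `detected` list is hypothetical input, and the function names are my own, not any tool's API.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def frequency_to_midi(freq_hz, a4_hz=440.0):
    """Map a frequency to the nearest MIDI note number (A4 = 69, 12-tone equal temperament)."""
    return round(69 + 12 * math.log2(freq_hz / a4_hz))

def midi_to_name(midi_note):
    """Convert a MIDI note number to a name like 'A4' (using the MIDI 60 = C4 convention)."""
    octave = midi_note // 12 - 1
    return f"{NOTE_NAMES[midi_note % 12]}{octave}"

# Hypothetical pitch-tracker output in Hz (the four lowest open guitar strings).
detected = [82.41, 110.0, 146.83, 196.0]
notes = [midi_to_name(frequency_to_midi(f)) for f in detected]
print(notes)  # ['E2', 'A2', 'D3', 'G3']
```

The hard part, and where accuracy varies with instrument and audio quality, is estimating those frequencies from polyphonic audio in the first place.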
ElectricalTears · 1h ago
A while ago (maybe a year) I asked chatgpt to make a guitar tab from a song that had no available tabs and it worked surprisingly well.
neonnoodle · 2h ago
No, that would be useful, and as such AI is incapable of doing it.
KerrAvon · 2h ago
LLMs can do this well, though, and there are such. They weren't calling themselves AI when I last looked a couple of years ago, but I'll bet any of them looking for VC money have rebranded since.
two-sandwich · 2h ago
If you consider that Entertainment gives the viewer what they want, and art intends to challenge, none of what's created here is "art". It doesn't push boundaries, create new genres, or satisfy an uncomfortable curiosity.
The tech here is fantastic. I love that such things are possible now and they're an exciting frontier in creation.
It's very dystopian to feel that the robots are making generic human-music with indescribably lifeless properties. I'm not an artist, so I don't feel personally attacked. Much like image gen, this seems to be aimed at replacing the bare-minimum artist (visual or auditory) with a "fill in the blanks" entertainment piece.
moritzwarhier · 2h ago
> Entertainment gives the viewer what they want, and art intends to challenge
This is a fruitless and snobby dichotomy that was attempted so many times in human history, and it makes no sense.
There will always be art made for success and/or money, but drawing a line is futile.
Händel used to be a bit like a pop musician.
And intellectual snobbishness or noble ideas do not make art more valuable.
A kid singing Wonderwall can be art, too. As can be a depressed person recording experimental field sounds.
Feel free to call art bad, but assuming an obvious and clear separation between art and entertainment is the exact opposite of the spirit that enables people to make or appreciate art, in whatever form, culture or shape.
viccis · 1h ago
>Händel used to be a bit like a pop musician.
Handel was never a "bit like a pop musician." This fundamentally misunderstands how music of his time, mostly funded and enjoyed in religious and wealthy-patronage contexts, was listened to. Mostly only the wealthy listened to his works, and those elite audiences were prone to viciously enforcing stylistic norms. The only real ways the working class heard his works were the occasional public concert and occasionally in church. At no point in any of these settings was there a lack of stylistic gatekeeping or snobbery.
I know this kind of nihilistic "everything is good, I guess, good doesn't even mean anything" attitude is popular in some spaces, but this lack of standards or gatekeeping in favor of a tasteless desire for increasing slop production regardless of quality is how we got poptimism and the current state of music. No longer is there any taste making, just taste production via algorithms.
Sometimes we need a bit of snobbery to separate the wheat from the chaff, and being a gatekeeping snob against AI music is what our current day and age needs more of!
kelseyfrog · 2h ago
Art is a framing device largely independent of the content. It's how we get Fountain[1], Piss Christ[2], Comedian[3], Mother![4], 4’33”[5], and Seedbed[6] to name a few among countless others. To claim that AI content is incapable of being framed as art is nonsense when we have example after example of the diversity of what art can be. Let's remember, bad art is still art.
1. Marcel Duchamp. 1917
2. Andres Serrano. 1987
3. Maurizio Cattelan. 2019
4. Darren Aronofsky. 2017
5. John Cage. 1952
6. Vito Acconci. 1972
fruitworks · 1h ago
If that were true, then the development of a new urinal factory would have the same impact on art as the development of a new AI art model.
The framing is dependent on the content.
kelseyfrog · 1h ago
Framings require creators; they don't arise spontaneously. Someone could turn a urinal factory into art, but art doesn't validate itself. Belief alone in artistic essentialism doesn’t make it so.
fruitworks · 1h ago
That's such a great pile of bullcrap, you could frame it and hang it up in a museum!
The point I'm making is that a unique framing only results in a single piece of worthwhile conceptual art. You can't have an infinite factory of Duchamp's Fountain. What makes the piece worthwhile is that it was an original idea.
Conceptual art is different from decorational art in this sense. AI music is largely a homogeneous synthesis of existing works. The AI "art" is decorational, not conceptual art. You could make an arrangement of AI art that is conceptual, but how many arrangements can you make that are actually worthwhile conceptually if AI art is generally homogeneous?
It's like asking how many worthwhile works of conceptual art can you produce with a urinal factory that makes identical clones of the same urinal? 0 to 1.
And besides, it's not necessarily true that all framings have creators. Nature is an example of a system that cultivates and curates a certain type of life without any rational process.
kelseyfrog · 42m ago
I don't disagree that being novel has its place in framing art, and I still believe that a Fountain Factory could certainly be framed as art.
To your other point, sorry, but artistic/aesthetic essentialism hasn't been a serious position for at least a hundred years.
As long as there is a perceiver, there is a frame.
The idea that nature is intrinsically beautiful is a frame. It's fine to hold that but it shouldn't be confused with not having a frame.
blargey · 1h ago
Art is communication.
That "generic" and "indescribably lifeless" feeling you get is because the only thing communicated by a model-and-prompt generation is the model identity and the prompt.
qgin · 1h ago
If your art can be replaced by a model that recycles what’s already been done, maybe you were just recycling what’s already been done too.
anigbrowl · 21m ago
How do you expect people to get good when AI is pushing them out of the entry-level stages where they were previously able to earn a modest living while developing their craft?
> oh now they won't have to do that boring mindless stuff like playing cover versions any more
That's how most musicians make their first $, doing covers or making something generic enough to be saleable as background music
sekai · 1h ago
> and art intends to challenge, none of what's created here is "art". It doesn't push boundaries, create new genres, or satisfy an uncomfortable curiosity.
Art is, above all, subjective.
> It's very dystopian to feel that the robots are making generic human-music with indescribably lifeless properties.
Painters said the same thing about the camera.
Photographers said the same thing about Photoshop.
fruitworks · 1h ago
This art is subjectively bad
ronsor · 2h ago
The fact that these models have so many people irritated like there's sand in their pants is enough proof that they're pushing boundaries and making some uncomfortable.
thefaux · 51m ago
Yes, because the people pushing the boundaries do not understand the value of the thing they are trying to commoditize. If they did, they wouldn't be trying to commoditize it. There is a pervasive attitude among technologists that they can improve things they don't understand through technological efficiency. They are wrong in this case and getting appropriate pushback.
Personally, music is sacred for me so making money is not a part of my process. I am not worried about job loss. But I am worried about the cultural malaise that emerges from the natural passivity of industrial scale consumerism.
fruitworks · 2h ago
the boundaries of unemployment perhaps
omnimus · 1h ago
boundaries of copyright
paxys · 2h ago
Tech is tech. What you create with the tech can be art.
bix6 · 2h ago
This feels different than experimenting with a new synth or something though. It’s just feeding a sentence to the model.
altruios · 2h ago
'Just' is a loaded word here.
In image gen: comfyUI gives a node-based workflow that gives a lot of room for 'creative' control, of mixing, and mathematically combining masks, filters, and prompts (and starting images / noise {at any node in that process}).
I would expect the same interface for audio to emerge for 'power users'.
kingstnap · 2h ago
I draw a very sharp line: curating the outputs and crafting the sentence is enough to make it art. If neither of those happen, it's just slop.
It's actually a bit like photography. A bunch of randomly taken pictures piled together is not art. It needs to be done with purpose and refinement.
Basically, in my own opinion, art ≠ a function of technical difficulty.
Art = Curation, Refinement, and Taste
fruitworks · 1h ago
Curation, refinement and taste are practically worthless on their own. The technical difficulty of art is the investment that makes it worth considering.
So if you are right, then art will pretty much be worthless in the future. You can just iterate over the search space defined by "good taste" and produce an infinite amount of good art for no work.
kingstnap · 1h ago
Curation is not worthless. It's the exact opposite, in the abundance of stuff, it's extremely valuable.
Search is not free, and it can never be free. What happens when search gets easier and easier is that your demands for quality and curation will get higher until all time saved in search efficiency is spent on search breadth.
blargey · 1h ago
Disagree that curation and prompts add artistry (dense intent reflected in the output) to AI generations.
"Curation" in AI can only surface the curator's local maxima among a tiny and arbitrary grab-bag of seed integers they checked among the space of 2^64 options; it's statistically skewed 99% towards the model's whims rather than anyone's unique intent or taste.
Prompt crafting is likewise terribly low fidelity since it's a constant battle with the model's idiosyncratic interpretation of the text, plus arbitrary perturbations that aren't actually correlated with the writer's supposed intent. And lord spare me the "high quality high resolution ultra detailed photorealistic trending on artstation" type prompts that amount to a zero-intent plea for "more gooder". And when pursuing artistry, using artist names / LoRAs is a meta-abandonment of personal direction, abdicating artistic control and responsibility to a model's idea of another artist's idea of what should be done.
Fancier workflows generally only multiply this prompt-and-curate process across regions/iterations, so can't add much because they're multiplying a tiny fraction by a fixed factor.
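To put rough numbers on the seed-space point, here is a back-of-the-envelope sketch. The 2^64 figure is from the comment above; the 1,000 seeds checked and the workflow factor of 50 are hypothetical values picked only to show the scale.

```python
SEED_SPACE = 2 ** 64  # size of the seed space mentioned above

def fraction_explored(seeds_checked, workflow_factor=1):
    """Share of the seed space covered after checking `seeds_checked` seeds,
    repeated across `workflow_factor` regions or iterations."""
    return seeds_checked * workflow_factor / SEED_SPACE

# Hypothetical numbers: 1,000 seeds checked, then a 50-step workflow on top.
plain = fraction_explored(1_000)
fancy = fraction_explored(1_000, workflow_factor=50)
print(f"{plain:.1e}  {fancy:.1e}")  # 5.4e-17  2.7e-15
```

Multiplying by the workflow factor moves the fraction by less than two orders of magnitude, which still leaves it vanishingly small.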
kingstnap · 1h ago
I agree with you on the idea of prompts and seeds leaving much to be desired. So that's why I think more sophisticated steering is necessary.
The models' latent space is extremely powerful, but you get hamstrung by the text encoder's whims when you do things through a prompt interface. In particular, you've hit exactly an issue I have with current LLMs in general, in that they are locked into words and concepts that others have defined (labelings of points in the latent space).
Wishy-washy thinking: it'd be nice if there were some sort of Turing-complete, lambda-calculus sort of way to prompt these models instead, where you can define new terms, create expressions, and use loops and recursion or something.
It would sort of be like how SVGs are "intent complete" and undeniably art, but instead of vector graphics, it is an SVG like model prompt.
CamperBob2 · 1h ago
"It's just pushing a button on the camera."
anigbrowl · 7m ago
This is a stupid argument, because even with the most automatic camera you have to point it at something and make a decision about what to frame in. AI music is more like buying a bunch of old unlabelled records from a bargain bin and then praising yourself whenever one of them turns out to be worth listening to.
TheAceOfHearts · 1m ago
Well, here's a few points of comparison:
Suno's version of Mandate of Heaven [0]. This is my baseline, it was generated with their v4 model and so far it has remained my favorite AI generated song. I regularly listen to this one track and it brings me joy. There's many places where I think it could be drastically improved, but none of the competitors have managed to surpass it nor have they provided tools to improve upon it. The pronunciation is a bit bad sometimes and it fails to hold notes as long as I wish, but overall it has gotten the closest to my vision.
Eleven Music's version of Mandate of Heaven [1]. They don't allow free accounts to export or share the full song so you can only try a small fragment. It has much crisper instruments and vocals, but it has terrible pacing issues and pronunciation. The track is 4 minutes long, but the singer is just rushing through the track at wildly unexpected speeds. I cannot even play the song after it finished generating, so I haven't even been able to listen to the whole thing, it just gets stuck when I press play. Maybe some kind of release-day bug. The only tool that Eleven Music gives you for refining and modifying sections is "Edit Style", which feels pretty limiting. But I can't even try it because the track won't play.
Producer.ai's version of Mandate of Heaven [2][3]. This one has slightly worse instruments than Eleven Music, but the vocals are a bit better than Suno v4. It also has severe timing issues. I tried asking it to generate the track without a vibe reference [2] and also with a vibe reference [3]. Both versions have terrible pacing issues; somehow the one with the vibe reference is particularly egregious, like it's trying to follow the input vibe but getting confused.
It feels like AI song generation is just in a really awkward place, where you don't get enough customization capabilities to really refine tracks into the perfect direction that you're imagining. You can get something sorta generic that sounds vaguely reasonable, but once you want to take more control you hit a wall.
If one is willing to bite the bullet, there's a paid program for generating high quality synthetic voices while maintaining fine-grained controls: Synthesizer V Studio 2. But I haven't been able to try it out because I'm cheap and there's no Linux support.
The ideal workflow I'm imagining would probably allow me to generate a few song variations as a starting point, while integrating into a tool like Synthesizer V Studio 2 so I can refine and iterate on the details. This makes a lot of sense too, because that's basically how we are using AI tools for programming: for anything serious you're generating some code and iterating on it or making tweaks for your specific program. I would like to specify which parts of the track are actually important to me, and which ones can be filled with sausage in reaction to my changes.
Overall, Eleven Music generates instruments that sound nice, but the singing leaves a lot to be desired (n=1). Eleven Labs is doing a ton of great product work so I'm really excited for the direction they'll take this once they're able to iterate on it a few times. A very strong showing for an initial release.
Having a machine-learning algorithm crank out generic music seems like peak dystopia to me
shadowgovt · 2h ago
Is it more or less dystopia than an army of musicians trying to eke a living out of creating stuff like this (https://www.chosic.com/free-music/presentation/) to go behind your company's PowerPoint about how its Q3 woodchip sales didn't quite exceed expectations?
Maybe the fundamental issue is that this shouldn't compete with a human picking up a guitar and having fun with it, and the only reason it does is because we keep tying questions like "survival" to whether someone can make woodchip earnings reports less boring to read instead of trying some other way to be a community?
bigfishrunning · 52m ago
Why does a powerpoint about woodchip sales need music? Also, I don't see how anyone is making a living on the royalty-free music you linked, unless there's some business model I'm not understanding.
whoamii · 2h ago
Wait until someone auto tunes an AI generated song.
throwawayoldie · 2h ago
I would like to say this categorically now: if you're using AI to generate any kind of art, go die in a fire.
stillpointlab · 2h ago
I really hope we move on from these boil-the-ocean models. I want something more collaborative and even iterative.
I was having a conversation with a former bandmate. He was talking about a bunch of songs he is working on. He can play guitar, a bit of bass and can sing. That leaves drums. He wants a model where he can upload a demo and it either returns a stem for a drum track or just combines his demo with some drums.
Right now these models are more like slot machines than tools. If you have the money and the time/patience, perhaps you can do something with it. But I am looking forward to when we start getting collaborative, interactive and iterative models.
krat0sprakhar · 1h ago
Very well said. I'm in the same boat. I'd love AI to write down a drum groove or a drum fill based on my guitar riff.
Currently, all these AI tools generate the whole song which I'm not at all interested in given songwriting is so much fun
viccis · 1h ago
RIP session musicians if that ever comes to pass, which is one of the main ways to make money if you are a good drummer.
pacifika · 1h ago
I’d recommend GarageBand for this.
stillpointlab · 1h ago
I haven't used the virtual drummer feature of GarageBand recently, but my experience with it was pretty disappointing. The output sounds very midi or like the most basic loops.
I believe there is massive room for improvement over what is currently available.
However, my larger point isn't "I want to do this one particular thing" and rather: I wish the music model companies would divert some attention away from "prompt a complete song in one shot" and towards "provide tools to iteratively improve songs in collaboration with a musician/producer".
amohn9 · 48m ago
Suno can already do that
feoren · 2h ago
We imagined a utopian future where robots did our menial work so we were free to be creative. Instead we got a dystopian future where we do more and more menial work so our robots can poorly emulate creativity. It's not too late to turn it around, but that requires recognizing the humanity of 99.9% of people, and the 0.1% who own everything would rather create their own synthetic (subservient) humans than recognize the basic rights of the ones that already exist (and can make fun of them on Twitter).
kingstnap · 1h ago
It's only from a position of extreme arrogance that you can complain that machines have not yet done enough for you.
But it's the fun thing about being humans, I suppose. Our insatiable greed means we demand endlessly more.
anigbrowl · 16m ago
This comment doesn't engage with the critique at all, it's just reflexive moralization.
krapp · 1h ago
They said the same thing about automation when the Industrial Revolution began a couple of centuries ago. That the common worker would be liberated from the drudgery of labor and be free for creative and intellectual pursuits. The people who protested were ridiculed as Luddites who simply feared technology and progress.
Of course, because automation serves the interests of capital (being created by, and invested in, by the capitalist class,) the end result was just that workers worked more, and more often, and got paid less, and the capitalist class captured the extra value. The Luddites were right about everything.
I don't know why people expect the automation of intellect and creativity to be any different. Working at a keyboard instead of on a factory floor doesn't exempt you from the incentives of capitalism.
hudo · 2h ago
AI is great, I can see it benefit so many industries, except music. There's something profoundly wrong with AI generated music.
thepryz · 2h ago
Maybe I’m a Luddite, but this seems like it will just lead to music becoming superficial and lacking intentionality, or dare I say it, soul.
crazydoggers · 2h ago
I think you just described pop music
thepryz · 2h ago
It’s easy and popular to hate on pop music, but even pop music has value and requires a certain skill to understand what resonates with people.
This is taking a monkeys on a typewriter approach to all music. Click a button, see what the monkeys made and then click another button to publish to Spotify while you figure out a way to either market the music or just game search and digital assistants by creating an artist with a similar or slightly misspelled name as someone popular. Rinse and repeat.
shadowgovt · 2h ago
The current solution for creating the kind of music this tool can back-fill is to go on a site like "Free music for presentations" and click line after line after line after line of 10-second samples hoping to find one that "vibes" with you.
If anything, this is a lateral move.
colechristensen · 2h ago
>music becoming superficial and lacking intentionality, or dare I say it, soul.
This phrase though could be plunked down at any point in the last hundred years and you'd find someone making it.
About autotune or electric guitars or rock or jazz or punk or disco or Philip Glass or Stravinsky... one could go on for a long time.
darth_avocado · 2h ago
A lot of criticism of AI in music seems to be around the lack of originality and it being generic slop. But unfortunately that’s true for pretty much the entire music industry. Handful of people write songs for all the artists, “artists” don’t create the songs as much as they perform them, most of the music isn’t created but rather sampled or is “inspired” by other music and pretty much most of the artists sound like other artists.
Yes there are smaller creators who are trying to make something net new, but unfortunately 99.9% of the small artists are also derivative and lack originality.
I see AI music as just continuation of the sad state of the industry at the moment. Hopefully it accelerates the demise of the industry as we know it and restarts the cycle of creation.
feoren · 2h ago
None of that is new, but there were ways for the genuinely new, inspired, and genius to actually shine through before. It was hard, but possible. Humanity is making decisions that make that even harder: AI music, Spotify's revenue model, etc. They're all to the benefit of cookie-cutter slop (AI or human-made) over creativity.
This wouldn't necessarily be a problem as long as people were still free to create on their own. But instead, everyone is forced to spend more hours in menial bullshit jobs for less and less (relative) pay just to survive. Give everyone enough resources to live at least a simple life, and both human creativity and AI creativity can blossom at the same time. But of course that means fewer yachts and hookers and drugs for the billionaires, so it is verboten.
pr337h4m · 2h ago
This could be repurposed for really good stylistic search for music (similar to how CLIP powered stylistic search for images on same.energy)
fidotron · 2h ago
What struck me more than the output is the absolute marketing talk of the prompting, so not only is the output kind of creepy but in order to get the best results you have to translate your intention into the worst corpo-speak imaginable.
RyanOD · 2h ago
I'm dreading the day I discover a new "band" I'm totally into only to discover it's entirely AI.
krat0sprakhar · 1h ago
Hope your favorite new band is not "The Velvet Sundown"
cschmidt · 2h ago
I worry how often that is happening already on Spotify.
drivers99 · 1h ago
At least I still have all my old CDs.
shadowgovt · 2h ago
Oh boy, do I ever have bad news about how pop music is currently created...
RyanOD · 18m ago
Increasingly I can't get into new music. For a while, I couldn't understand why as I've been a rabid musician / music lover my entire life. Even music I thought was cool, I couldn't really get into. I always preferred the music I grew up with.
Eventually, the reason why became obvious. I grew up listening to all that music with my closest friends. It's the memories I associate with that music that keeps me coming back. I moved away 30 years ago and never established friendships like that again. New music feels hollow to me because I don't have buddies to share it with and build associated memories.
footy · 47m ago
this would only be bad news for that commenter if they're into pop music
sys32768 · 2h ago
I think AI will spark a revival in "organic" human culture and art, with people flocking to see real musicians play real instruments, and real artists using real materials.
It will probably also extinguish quite a few mad musicians and mediocre artists.
throwaway1280 · 2h ago
This is an incredible achievement, but as a musician, I wish this would go die in a fire.
I'm a bedroom hobby musician with no dreams of ever making it big, but even so, I'm looking at the hours I'm spending trying to improve my skills and thinking what's the point, really, when I could just type in 'heavy metal guitar solo at 160bpm, A minor' and get something much much better?
I know there is value in creating art for art's sake. I've always been up against a sea of internet musicians, even when I started back in 2000. But there's just something about this that's much more depressing, when it's not even other people competing with me, but a machine which hasn't had to invest years of its life in practice to beat me.
rootforce · 1h ago
I can't see the future, but I imagine that the human art community may actually get more vibrant when divorced from being a way to make a living. Perhaps a return to something like a patronage system for the exceptional artists.
Open mics, music circles and concerts also remain untouched for the moment.
freedomben · 1h ago
As a fellow (hobbyist) musician, I feel you, but after doing a lot of introspection I realized that it's the art (and the process) that I really like, not (just) the end result (though that is of course a rewarding aspect). For example, jamming out to a kickin' song is fun, even if I'm just covering something. I also realized that my own ability to produce things isn't affected by this (as long as you don't want to make money on it). As someone who loves to play bass but is generally bad at writing bass riffs, I also see some fun potential to use AI to get bass tracks that go along with my main guitar riffs. I can always throw them out and rewrite from scratch later, or just iterate on them to get them where I like them. I do think I'll feel a bit of a loss of "artistic purity" with doing something like that, but the more I think about it, the biggest reason that might bother me is because I'd feel judged by other musicians :-D
shadowgovt · 1h ago
When Harmonix was in the concept phase for Guitar Hero, there were two slides in the presentation. The first was the "con" slide: novel game style risks not finding a market, technical challenges of the new style of gameplay, requiring peripherals for a game is a famous pathway to low-volume sales.
The next slide was labeled "pro," and it was just a picture of Jimi Hendrix on-stage mid-performance.
I'd submit to you the notion that even if the machine can create a billion billion iterations of music, it still cannot create what you will create, for the reasons you will create it, and that's reason enough to continue. Hendrix wasn't just "a guy who played guitar good." And a machine that could word-for-word and bar-for-bar synthesize "Foxy Lady" wouldn't be Hendrix.
Hendrix, also, can't be you. Nor you him.
thefaux · 1h ago
It is embarrassing how much time I spent playing that dumb game instead of actually practicing a more versatile instrument.
thefaux · 1h ago
I would suggest that instead of feeling demoralized by ai that you instead ask what can you offer musically that an ai cannot. I also would suggest trying to let go of the notion that music is primarily about achieving a certain level of technical proficiency. There are no limits on your growth musically unless you artificially constrain them because you are deluded by a technology to believe that you don't already have what you need inside of you.
Do you regularly play with other people? That is a good way to disabuse yourself of the notion that all that matters is technique.
yanis_t · 2h ago
Has anyone compared this to Suno quality-wise? Seems like one benefit is the API availability.
TrackerFF · 59m ago
I'm a musician
I haven't used the ElevenLabs one, but I've checked out Suno and Udio, and to be honest the tech is amazing. But like with a lot of genai images, the current music models all have the same smell to them.
These models can def be used to crank out commercially sounding music, but I have yet to hear any output that sounds fresh and exciting. If your goal is to create bro country, these models are a god-send.
With that said, I do believe that musicians will start to create music with these tools as aid. I've tried to use them for generating samples and ideas, and they do work well enough.
MarcelOlsz · 54m ago
>With that said, I do believe that musicians will start to create music with these tools as aid.
The more tech advances the cooler it is to bury your head in the sand. Studying by paper & candlelight has never been more of a flex.
Unless you're talking about EDM people and those adjacent. Not that they're not "real musicians" but they're much more about tech and gadgets so I can see them using it more.
anigbrowl · 12m ago
I'm an electronic musician and AI music is banned on my favorite forum. Someone popped up a few months back challenging everyone to listen to their AI-generated music album and ended up being flamed to a crisp - not because people didn't engage with it, but because they did and pointed out in detail what was so bad about it.
TrackerFF · 33m ago
I'm a "real" musician, and have done session / studio work, as well as made a living by playing live music in my younger days, before getting a "real job" in tech.
While there will no doubt be many that feel they're above using tools like these, the reality is that if you want to make money out of music - you're going to make music for the masses.
And if there's one thing these models really excel at, it is to make commercial sounding music. Everything sounds nice and bland.
wturner · 2h ago
Write a song about the billionaire social engineer Peter Thiel getting robbed and murdered by a broke minimum wage worker with no healthcare.
"That is not allowed by our terms of service"
I think the rebellious nature of art inherently has boundaries these people won't cross.
zubzubi · 1h ago
Can it do differently-styled covers of songs or improvisations upon melodies like Suno can?
As a musician, that's what I find most compelling about Suno. It's become a tool to collaborate with, to help test out musical ideas and inspire new creativity. I listen to its output and then take the parts I like to weave into my own creations.
The AI music tools that generate whole songs out of prompts are a curious gimmick, but are sorely lacking in the above.
asadm · 2h ago
Do we yet have a way to "autocomplete" music from existing music? or from humming to music (music-to-music?)?
I am less interested in the "one-shot" approach here with text-to-prompt. I see seamless transitions but that seems like an afterthought.
recursive · 2h ago
> Studio-grade
The vocals are definitely not that.
artninja1988 · 2h ago
The context length seems super small. The prompt needs to be shorter than 2k characters, which is super limiting. Hope they address this soon.
Oceoss · 2h ago
I've been waiting for this AI development for a while
It's amazing that the songs sound pretty natural
throwaway_32u10 · 2h ago
Behold, we are one step closer to a world where "AI will do the mundane tasks, and we all are going to engage in creative hobbies such as music creation"... Oh... wait...
bigbuppo · 2h ago
Well, they've done it boys, they've made creative fulfillment obsolete. They've DISRUPTED the concept of going to big music festivals, and small cozy shows. Just plug your ear holes with the AI slop bucket's pure beeps and boops and never have to worry again about paying artists for music. You can pay a techbro instead.
This is like the dotcom era, where every idiotic idea that ended with "but on the internet" would get a pile of cash thrown at it. We are officially at the beginning of the end. It's only going to get dumber from here.
shadowgovt · 2h ago
There isn't any relationship whatsoever between a concert or music festival and this tech. Two entirely different experiences.
iamsaitam · 2h ago
The silver lining to any of these models is that what AI generates is total shite.
everfrustrated · 2h ago
Finally non-repeating muzak is here
jp1016 · 2h ago
Lot of AI progress today: OpenAI, Claude, and now ElevenLabs.
mixologic · 2h ago
Well, so much for culture. Every use case where you can plausibly use AI generated music removes one more method that provided an avenue to a reasonable career making music.
And then I found this live version here that I'm studying: https://www.youtube.com/watch?v=pPQZsp59szo
[1] https://www.youtube.com/results?search_query=drummer+replica...
The tech here is fantastic. I love that such things are possible now and they're an exciting frontier in creation.
It's very dystopian to feel that the robots are making generic human-music with indescribably lifeless properties. I'm not an artist, so I don't feel personally attacked. Much like image gen, this seems to be aimed at replacing the bare-minimum artist (visual or auditory) with a "fill in the blanks" entertainment piece.
This is a fruitless and snobby dichotomy that was attempted so many times in human history, and it makes no sense.
There will always be art made for success and/or money, but drawing a line is futile.
Händel used to be a bit like a pop musician.
And intellectual snobbishness or noble ideas do not make art more valuable.
A kid singing Wonderwall can be art, too. As can be a depressed person recording experimental field sounds.
Feel free to call art bad, but assuming an obvious and clear separation between art and entertainment is the exact opposite of the spirit that enables people to make or appreciate art, in whatever form, culture or shape.
Handel was never a "bit like a pop musician." This fundamentally misunderstands how music during his time, mostly funded and enjoyed under religious and wealthy patronage contexts, was listened to. Mostly only the wealthy listened to his works, and those elite audiences were prone to viciously enforcing stylistic norms. The only real way the working class heard his works was in the occasional public concert and occasionally in church. At no point in any of these settings was there a lack of stylistic gatekeeping or snobbery.
I know this kind of nihilistic "everything is good, I guess, good doesn't even mean anything" attitude is popular in some spaces, but this lack of standards or gatekeeping in favor of a tasteless desire for increasing slop production regardless of quality is how we got poptimism and the current state of music. No longer is there any taste making, just taste production via algorithms.
Sometimes we need a bit of snobbery to separate the wheat from the chaff, and being a gatekeeping snob against AI music is what our current day and age needs more of!
1. Marcel Duchamp. 1917
2. Andres Serrano. 1987
3. Maurizio Cattelan. 2019
4. Darren Aronofsky. 2017
5. John Cage. 1952
6. Vito Acconci. 1972
The framing is dependent on the content
The point I'm making is that a unique framing only results in a single piece of worthwhile conceptual art. You can't have an infinite factory of Duchamp's Fountain. What makes the piece worthwhile is that it was an original idea.
Conceptual art is different from decorational art in this sense. The AI music is largely a homogeneous synthesis of existing works. The AI "art" is decorational, not conceptual art. You could make an arrangement of AI art that is conceptual, but how many arrangements can you make that are actually worthwhile conceptually if AI art is generally homogeneous?
It's like asking how many worthwhile works of conceptual art can you produce with a urinal factory that makes identical clones of the same urinal? 0 to 1.
And besides, it's not necessarily true that all framings have creators. Nature is an example of a system that cultivates and curates a certain type of life without any rational process.
To your other point, sorry, but artistic/aesthetic essentialism hasn't been a serious position for at least a hundred years.
As long as there is a perceiver, there is a frame.
The idea that nature is intrinsically beautiful is a frame. It's fine to hold that but it shouldn't be confused with not having a frame.
That "generic" and "indescribably lifeless" feeling you get is because the only thing communicated by a model-and-prompt generation is the model identity and the prompt.
> oh now they won't have to do that boring mindless stuff like playing cover versions any more
That's how most musicians make their first $, doing covers or making something generic enough to be saleable as background music
Art is, above all, subjective.
> It's very dystopian to feel that the robots are making generic human-music with indescribably lifeless properties.
Painters said the same thing about the camera. Photographers said the same thing about Photoshop.
Personally, music is sacred for me so making money is not a part of my process. I am not worried about job loss. But I am worried about the cultural malaise that emerges from the natural passivity of industrial scale consumerism.
In image gen: comfyUI gives a node-based workflow that gives a lot of room for 'creative' control, of mixing, and mathematically combining masks, filters, and prompts (and starting images / noise {at any node in that process}).
I would expect the same interface for audio to emerge for 'power users'.
It's actually a bit like photography. A bunch of randomly taken pictures piled together is not art. It needs to be done with purpose and refinement.
Basically, in my own opinion, art ≠ a function of technical difficulty.
Art = Curation, Refinement, and Taste
So if you are right, then art will pretty much be worthless in the future. You can just iterate over the search space defined by "good taste" and produce an infinite amount of good art for no work.
Search is not free, and it can never be free. What happens when search gets easier and easier is that your demands for quality and curation will get higher until all time saved in search efficiency is spent on search breadth.
"Curation" in AI can only surface the curator's local maxima among a tiny and arbitrary grab-bag of seed integers they checked among the space of 2^64 options; it's statistically skewed 99% towards the model's whims rather than anyone's unique intent or taste.
Prompt crafting is likewise terribly low fidelity since it's a constant battle with the model's idiosyncratic interpretation of the text, plus arbitrary perturbations that aren't actually correlated with the writer's supposed intent. And lord spare me the "high quality high resolution ultra detailed photorealistic trending on artstation" type prompts that amount to a zero-intent plea for "more gooder". And when pursuing artistry, using artist names / LoRAs is a meta-abandonment of personal direction, abdicating artistic control and responsibility to a model's idea of another artist's idea of what should be done.
Fancier workflows generally only multiply this prompt-and-curate process across regions/iterations, so can't add much because they're multiplying a tiny fraction by a fixed factor.
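To put rough numbers on the seed-space point above, here is a minimal sketch of the arithmetic. The 2^64 figure is the one quoted in the comment; the session size (1,000 generations auditioned) and the 100x workflow multiplier are made-up values purely for illustration, and `fraction_explored` is a hypothetical helper, not part of any tool.

```python
from fractions import Fraction

SEED_SPACE = 2 ** 64  # seed space size quoted in the comment above


def fraction_explored(seeds_checked: int, workflow_factor: int = 1) -> Fraction:
    """Exact share of the seed space covered by one prompt-and-curate session.

    seeds_checked   -- how many generations the curator actually auditioned
    workflow_factor -- hypothetical multiplier from a fancier workflow
                       (more regions or iterations per prompt)
    """
    return Fraction(seeds_checked * workflow_factor, SEED_SPACE)


# Assumed, illustrative numbers only.
plain = fraction_explored(1_000)
fancy = fraction_explored(1_000, workflow_factor=100)

print(f"plain workflow: {float(plain):.3e} of the seed space")
print(f"fancy workflow: {float(fancy):.3e} of the seed space")
```

Under these assumptions the fancier workflow covers exactly 100 times more of the space, but both shares stay many orders of magnitude below one, which is the sense in which a fixed factor doesn't add much.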
The models' latent space is extremely powerful, but you get hamstrung by the text encoder's whims when you do things through a prompt interface. In particular, you've hit exactly the issue I have with current LLMs in general: they are locked into words and concepts that others have defined (labelings of points in the latent space).
Wishy-washy thinking: it'd be nice if there were some sort of Turing-complete, lambda-calculus-like way to prompt these models instead, where you can define new terms, create expressions, and use loops and recursion or something.
It would sort of be like how SVGs are "intent complete" and undeniably art, but instead of vector graphics, it is an SVG like model prompt.
Suno's version of Mandate of Heaven [0]. This is my baseline, it was generated with their v4 model and so far it has remained my favorite AI generated song. I regularly listen to this one track and it brings me joy. There's many places where I think it could be drastically improved, but none of the competitors have managed to surpass it nor have they provided tools to improve upon it. The pronunciation is a bit bad sometimes and it fails to hold notes as long as I wish, but overall it has gotten the closest to my vision.
Eleven Music's version of Mandate of Heaven [1]. They don't allow free accounts to export or share the full song so you can only try a small fragment. It has much crisper instruments and vocals, but it has terrible pacing issues and pronunciation. The track is 4 minutes long, but the singer is just rushing through the track at wildly unexpected speeds. I cannot even play the song after it finished generating, so I haven't even been able to listen to the whole thing, it just gets stuck when I press play. Maybe some kind of release-day bug. The only tool that Eleven Music gives you for refining and modifying sections is "Edit Style", which feels pretty limiting. But I can't even try it because the track won't play.
Producer.ai's version of Mandate of Heaven [2][3]. This one has slightly worse instruments than Eleven Music, but the vocals are a bit better than Suno v4. It also has severe timing issues. I tried asking it to generate the track without a vibe reference [2] and also with a vibe reference [3]. Both versions have terrible pacing issues; somehow the one with the vibe reference is particularly egregious, like it's trying to follow the input vibe but getting confused.
It feels like AI song generation is just in a really awkward place, where you don't get enough customization capabilities to really refine tracks into the perfect direction that you're imagining. You can get something sorta generic that sounds vaguely reasonable, but once you want to take more control you hit a wall.
If one is willing to bite the bullet, there's a paid program for generating high quality synthetic voices while maintaining fine-grained controls: Synthesizer V Studio 2. But I haven't been able to try it out because I'm cheap and there's no Linux support.
The ideal workflow I'm imagining would probably allow me to generate a few song variations as a starting point, while integrating into a tool like Synthesizer V Studio 2 so I can refine and iterate on the details. This makes a lot of sense too, because that's basically how we are using AI tools for programming: for anything serious you're generating some code and iterating on it or making tweaks for your specific program. I would like to specify which parts of the track are actually important to me, and which ones can be filled with sausage in reaction to my changes.
Overall, Eleven Music generates instruments that sound nice, but the singing leaves a lot to be desired (n=1). Eleven Labs is doing a ton of great product work so I'm really excited for the direction they'll take this once they're able to iterate on it a few times. A very strong showing for an initial release.
[0] https://suno.com/s/HfDUqRp0ca2gwwAx
[1] https://elevenlabs.io/music/songs/TGyOFpwJsHdS3MTiHFUP
[2] https://www.producer.ai/song/aa1f3cc4-f3e4-40ce-9832-47dc300...
[3] https://www.producer.ai/song/3d02dd17-69f1-41ba-a3ea-967902f...
Maybe the fundamental issue is that this shouldn't compete with a human picking up a guitar and having fun with it, and the only reason it does is because we keep tying questions like "survival" to whether someone can make woodchip earnings reports less boring to read instead of trying some other way to be a community?
I was having a conversation with a former bandmate. He was talking about a bunch of songs he is working on. He can play guitar, a bit of bass and can sing. That leaves drums. He wants a model where he can upload a demo and it either returns a stem for a drum track or just combines his demo with some drums.
Right now these models are more like slot machines than tools. If you have the money and the time/patience, perhaps you can do something with it. But I am looking forward to when we start getting collaborative, interactive and iterative models.
Currently, all these AI tools generate the whole song which I'm not at all interested in given songwriting is so much fun
I believe there is massive room for improvement over what is currently available.
However, my larger point isn't "I want to do this one particular thing" and rather: I wish the music model companies would divert some attention away from "prompt a complete song in one shot" and towards "provide tools to iteratively improve songs in collaboration with a musician/producer".
But it's the fun thing about being humans, I suppose. Our insatiable greed means we demand endlessly more.
Of course, because automation serves the interests of capital (being created by, and invested in, by the capitalist class,) the end result was just that workers worked more, and more often, and got paid less, and the capitalist class captured the extra value. The Luddites were right about everything.
I don't know why people expect the automation of intellect and creativity to be any different. Working at a keyboard instead of on a factory floor doesn't exempt you from the incentives of capitalism.
This is taking a monkeys on a typewriter approach to all music. Click a button, see what the monkeys made and then click another button to publish to Spotify while you figure out a way to either market the music or just game search and digital assistants by creating an artist with a similar or slightly misspelled name as someone popular. Rinse and repeat.
If anything, this is a lateral move.
This phrase though could be plunked down at any point in the last hundred years and you'd find someone making it.
About autotune or electric guitars or rock or jazz or punk or disco or Philip Glass or Stravinsky... one could go on for a long time.
Yes there are smaller creators who are trying to make something net new, but unfortunately 99.9% of the small artists are also derivative and lack originality.
I see AI music as just continuation of the sad state of the industry at the moment. Hopefully it accelerates the demise of the industry as we know it and restarts the cycle of creation.
This wouldn't necessarily be a problem as long as people were still free to create on their own. But instead, everyone is forced to spend more hours in menial bullshit jobs for less and less (relative) pay just to survive. Give everyone enough resources to live at least a simple life, and both human creativity and AI creativity can blossom at the same time. But of course that means fewer yachts and hookers and drugs for the billionaires, so it is verboten.
Eventually, the reason why became obvious. I grew up listening to all that music with my closest friends. It's the memories I associate with that music that keeps me coming back. I moved away 30 years ago and never established friendships like that again. New music feels hollow to me because I don't have buddies to share it with and build associated memories.
It will probably also extinguish quite a few mad musicians and mediocre artists.
I'm a bedroom hobby musician with no dreams of ever making it big, but even so, I'm looking at the hours I'm spending trying to improve my skills and thinking what's the point, really, when I could just type in 'heavy metal guitar solo at 160bpm, A minor' and get something much much better?
I know there is value in creating art for art's sake. I've always been up against a sea of internet musicians, even when I started back in 2000. But there's just something about this that's much more depressing, when it's not even other people competing with me, but a machine which hasn't had to invest years of its life in practice to beat me.
Open mics, music circles and concerts also remain untouched for the moment.
Oh wait.