ACE-Step: A step towards music generation foundation model

37 wertyk 17 5/6/2025, 8:38:00 PM github.com ↗

Comments (17)

999900000999 · 2h ago
> RapMachine

Fine-tuned on pure rap data to create an AI system specialized in rap generation. Expected capabilities include AI rap battles and narrative expression through rap. Rap has exceptional storytelling and expressive capabilities, offering extraordinary application potential.

Using a certain other music generator I got it to accidentally say ***. It said it with a Latino American accent too.

In fact, for whatever reason, this tool couldn’t use a typical AAVE voice. Just Sage Francis / Atmosphere-like dictionary raps and a few Latino American ones.

A big limitation of AI slop is that it tries to not offend anyone.

Art that can’t even try to offend is barely art.

Duwensatzaj · 1h ago
> Art that can’t even try to offend is barely art.

Good art is secondary to avoiding some journalist writing a hit piece about how they used your company’s AI generator to depict Hitler saying the gamer word.

lwansbrough · 43m ago
To be fair, nothing AI-generated is inherently “good art” if you take artistic intent into account.

999900000999 · 1h ago
AI-generated articles about AI-generated music, which is generated from reading AI-generated articles.

Shall it continue in an unholy loop until the end of time?

skybrian · 1h ago
Instrumental music doesn’t count as art? Come on.

999900000999 · 1h ago
Instrumental music can absolutely offend!

It can challenge the old standards, it can push genres into new places.

AI music can’t. It’s too safe.

PaulDavisThe1st · 1h ago
> Instrumental music can absolutely offend!

The Rite of Spring (Stravinsky) has entered the chat!

And if that's not offensive enough, Music in Similar Motion by Glass, or Metal Machine Music by Pat Metheny or any of Glenn Branca's "guitar symphonies" will likely do the job for most people.

resize2996 · 33m ago
Lou Reed - Metal Machine Music

Pat Metheny - Zero Tolerance for Silence

Rodeoclash · 2h ago
As a musician, the things I want most from generative AI are:

1. Being able to have the AI fill in a track in the song, but use the whole song as input to figure out what to generate. Ideally for drums this would be a combination of individual drum hits, effects and MIDI so I'm able to tweak it after generation. If it used the Ableton effects and drum rack then that would be perfect.

2. Take my singing and make it both sound great and like any combination of great singers (e.g. give me a bit of Taylor Swift combined with Cat Power)

I've had a play with the style transfer between singers (bullet point 2 above) but when I last tried it, it was garbage in / garbage out, and my singing is garbage.

What I don't want: To just generate a whole song. Adobe does this style of assistive AI well in the photo editing space but no one seems to have brought it to audio yet.

vunderba · 26m ago
Logic Pro comes the closest to this with the addition of virtual drummers. You can assign it to follow certain rhythmic timings by connecting it to a main instrument track, and by labeling sections of your song (bridge, chorus, etc) you can regenerate until you find something you're happy with.

It's a far cry from having a real drummer, but it works in a pinch.

pelagic_sky · 1h ago
Exactly. I don’t want AI to make me a song. I want it to switch up my bassline or chords or recommend a fill instrument.

lilcrise · 33m ago
Can it run on a VPS this small? Specs: 4 CPU cores, 4 GB RAM, 200 GB SSD disk space, Ubuntu Server 22.04.

architectonic · 2h ago
How do the quality and prompt adherence compare to Suno v4?

dheera · 45m ago
The diagram is super vague. How are the lyrics encoded? What does the encoder look like inside? What is the input size, input format, output size, output format? Are the three encoder outputs added? Concatenated? When Mert and m-Hubert combine are they added? Multiplied? Subtracted? Concatenated?

I really wish people could make better diagrams.

hammock · 2h ago
Is there a demo hosted somewhere?

frankfrank13 · 3h ago
Fun to play with!