I maintain my own digital music collection. The only two tools I use for maintaining the CD portion of my collection are k3b and MusicBrainz Picard. k3b can rip to flac and it will only embed metadata present on the CD itself. Then after I rip it, I add it to Picard.

I use the "lookup CD" feature in Picard, which gives me a selection of releases to choose from. Among the choices, I usually see a release matching the catalog number on my CD's case. When I don't see a matching release, I will typically add the disc ID to an existing release, or I will create a new release, or sometimes even create a new release + new release group and add the necessary metadata to MusicBrainz.

I haven't tried any automatic tagging process like the ripping program the article talks about does, mostly because I want to use Picard to make sure the metadata is correct or contribute to MusicBrainz if it isn't.

I like MusicBrainz a lot because applications like Plex use it very well to group release groups together and will (usually) deduplicate identical recordings so that identical tracks can share a rating. It's a really great database and is kept up to date pretty well.

CharlesW · 18h ago

MusicBrainz Picard is wonderful, but has one of the most unintuitive "first contact" experiences I can remember. If you're not sure how to get started, try this:

• Drag your album folders (one at a time so it doesn't get confused) into the pane that initially shows "Unclustered Files (0)" and "Clusters (0)".

• Select the "Clusters" folder in that pane and click "Lookup". This will find any close matches, and in my experience works ~25% of the time.

• For albums that weren't auto-matched, right-click the album folder name and choose "Search for similar albums…". As long as you're sorting by "Score", often you'll find a reasonably-good match in the top 5 options.

• NEVER use "Scan", basically.

For matched albums, carefully review things like album covers, titles, etc. before you "Save" the updated metadata. After using it to rebuild my personal music library, including ~200 contributions to the MusicBrainz database, I still haven't cracked (for example) how to stop Picard from defaultly replacing a perfect, 1500px album cover with a less-good, 1000px cover from its database.

sumtechguy · 1h ago

For cover art you can control it from options->options->cover art. There are also a couple of plugins for other sources.

There are a few items in there to control if it scans external or overwrite. Recently went thru this as apparently for some reason I had totally disabled it. Think I was trying to speed up scanning as it would download every artwork for a large group into the temp folder. I usually force it to make an external file. I pick what it suggested 'cover'. Then use something like fileoptimizer to recompress the jpg/png it comes up with. I do that because I like to embed the images. And much of what is out on the net is optimized for fast editing not 'archive'. I use mp3tag to put it back into the tag.

Scan is hit or miss. I have fed it whole albums and it will somehow find 3 other albums with some of the songs from that one. That could be because of how I have options->options->metadata->Preferred Releases set. That slider bar thing for some reason I can not wrap my head around. It is good for when you come across one of those items where someone else tagged it as 'weird al' (everything is weird al if it is funny). I have been slowly getting rid of that stuff but want to find the original album to buy. Musicbrainz can be good for that sort of thing. I have also had decent luck with it if I pre-add the albums then scan. It seems to find things better.

lloeki · 7h ago

> MusicBrainz Picard is wonderful, but has one of the most unintuitive "first contact" experiences I can remember.

Seconded, it's the best specialised UI I've seen in a while.

By "specialised" I mean it's entirely bespoke to a specific task and no other, with a small amount of dedicated jargon, like those industrial control panels full of buttons, toggles, and blinkenlights.

At first it's completely alien and appears to do weird stuff, possibly counterintuitive even (the mentioned "Scan" usage†, "what are clusters?", "why do I even need to cluster first?", "how do I save changes?")

But once you get the hang of it it's incredibly efficient with a ton of small niceties, like dragging a selection of entries from the left side will apply whatever candidates you have on the right side to the selection in order starting from the first.

† I use scanning only when album matching fails for whatever reason, it does sometimes unearth entries that wouldn't appear otherwise.

fsckboy · 11h ago

>NEVER use "Scan", basically

never use "scan" because it will never work? or because it is somehow destructive and will mess up your "cataloging"?

CharlesW · 10h ago

I have no doubt that it sometimes works, and would be happy to accept a verdict of "skill issue" if the problem is me.

Scanning a Cluster should (IMO) cause Picard to generate a series of AcoustID fingerprints/IDs from the tracks, then use that series to identify the best match (with extra points for handling missing tracks, etc.). But especially in the case of collections/compilations, the end result often resembles a transporter accident. Thankfully it's non-destructive, so it's straightforward to "Remove" all of the tracks you dragged in earlier along with the various albums that MusicBrainz created during the discombobulation process.
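To make the idea concrete, here is a minimal sketch of the album-level matching described above: each file's fingerprint lookup yields a set of candidate releases, and the cluster goes to the release most files agree on. This is not how Picard actually works; `best_release`, `min_coverage`, and the release IDs are hypothetical names made up for illustration.

```python
from collections import Counter


def best_release(track_candidates, min_coverage=0.5):
    """Pick the release that the most tracks in a cluster agree on.

    track_candidates: one set of candidate release IDs per file
    (hypothetical output of a per-track fingerprint lookup).
    """
    votes = Counter()
    for candidates in track_candidates:
        votes.update(set(candidates))  # each file votes once per release
    if not votes:
        return None
    release, count = votes.most_common(1)[0]
    # Tolerate missing or unidentified tracks, but require reasonable coverage.
    if count / len(track_candidates) < min_coverage:
        return None
    return release


# Hypothetical lookups for a 4-file cluster: some tracks also appear on
# compilations, but only "album-A" is shared by most of them.
cluster = [
    {"album-A", "compilation-X"},
    {"album-A", "compilation-Y"},
    {"album-A"},
    set(),  # fingerprint lookup failed for this file
]
print(best_release(cluster))  # album-A
```

Voting across the whole cluster is what keeps a single track's compilation match from scattering the album across several releases.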

To be clear, my overall opinion of MusicBrainz and MusicBrainz Picard is that they are unappreciated triumphs. It would be nice if Wikipedia and Internet Archive diverted 0.01% of their fundraising to them. Google is the primary hero in their story, supporting them with over $500K so far. https://metabrainz.org/sponsors

prmoustache · 1h ago

What does "maintain" mean in that context? Once you have ripped your cd and stored it somewhere there is nothing to maintain afaik (well, apart from having backups if you don't want to ever rip them again).

mayneack · 20h ago

Yeah, imo using musicbrainz/picard is great for the process of bringing something into your collection. I encounter errors like others here have mentioned, but they're straightforward to fix. Importantly, it sets up a reference to an evolving update process so changes down the line can get back to my files cleanly.

commotionfever · 19h ago

since you mention Picard and wanting to contribute to MusicBrainz. I'm working on a new fast tagger[1] in the spirit of Picard or beets. Just a little different and more scriptable

It makes its best attempt to match with MusicBrainz, but if there's no match it offers links to pre-seed MusicBrainz with tools like Harmony

https://github.com/sentriz/wrtag

CharlesW · 18h ago

Harmony (https://harmony.pulsewidth.org.uk/) is amazing, and completely changed my relationship with MusicBrainz.

What are you using for tag reading/writing in Go? Robust, complete options are non-existent in JavaScript land (Deno, Bun, Node, etc.), so I ended up creating a Wasm version of TagLib with a TypeScript API.

commotionfever · 17h ago

haha that's funny! I made a WASM TagLib for Go

https://github.com/sentriz/go-taglib

CharlesW · 16h ago

Cooool, I love that you arrived at the same conclusion! Mine's not ready for its ShowHN, but as an enthusiast, I'm super-excited to dig into yours. Very nice work!

riedel · 11h ago

I recommend also AudioRanger for resorting and moving stuff into the right places. For ripping I use ExactAudioCopy, which supports also flac.

mikepavone · 14h ago

This is a small point, but calling the 33-byte unit a sector in CDDA is a bit misleading and probably incorrect for the quantity being labeled. This is a channel data frame and contains 24-bytes of audio data, 1 byte of subcode data (except for the channel data frames that have sync symbols instead) and the rest is error correction. This is the smallest grouping of data in CDDA, but it's not really an individually addressable unit.

98 of these channel data frames make up a timecode frame which represents 1/75th of a second of audio and has 2352 audio data bytes, 96 subcode bytes (2 frames have sync codes instead) with the remainder being sync and error correction. Timecode frames are addressable (via the timecodes embedded in the subcode data) and are the unit referred to in the TOC. This is probably what's being called a sector here. Notably, a CD-ROM sector corresponds 1:1 with a timecode frame.

Note: Red book actually just confusingly calls both of these things frames and does not use the terms "channel data frame" or "timecode frame"
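The numbers above can be checked with a few lines of arithmetic. This is only a sanity check of the figures in this comment; the 44.1 kHz / stereo / 16-bit parameters are the standard CD audio format, and the constant names are made up for illustration.

```python
# Figures taken from the comment above.
CHANNEL_FRAME_BYTES = 33
AUDIO_BYTES_PER_CHANNEL_FRAME = 24
SUBCODE_BYTES_PER_CHANNEL_FRAME = 1
CHANNEL_FRAMES_PER_TIMECODE_FRAME = 98
TIMECODE_FRAMES_PER_SECOND = 75

# Standard CD audio parameters: 44.1 kHz, 2 channels, 16-bit samples.
SAMPLE_RATE = 44_100
CHANNELS = 2
BYTES_PER_SAMPLE = 2

# "The rest is error correction": whatever is left of the 33 bytes.
error_correction_bytes = (
    CHANNEL_FRAME_BYTES
    - AUDIO_BYTES_PER_CHANNEL_FRAME
    - SUBCODE_BYTES_PER_CHANNEL_FRAME
)

audio_bytes_per_timecode_frame = (
    AUDIO_BYTES_PER_CHANNEL_FRAME * CHANNEL_FRAMES_PER_TIMECODE_FRAME
)
# Two of the 98 channel data frames carry sync symbols instead of subcode.
subcode_bytes_per_timecode_frame = CHANNEL_FRAMES_PER_TIMECODE_FRAME - 2

audio_bytes_per_second = audio_bytes_per_timecode_frame * TIMECODE_FRAMES_PER_SECOND

print(error_correction_bytes)            # 8
print(audio_bytes_per_timecode_frame)    # 2352
print(subcode_bytes_per_timecode_frame)  # 96
print(audio_bytes_per_second)            # 176400
print(audio_bytes_per_second == SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE)  # True
```

The last line is the useful cross-check: 75 timecode frames of 2352 audio bytes come out to exactly one second of 16-bit stereo audio at 44.1 kHz.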

Lammy · 20h ago

I used to do the MusicBrainz thing with Picard and later with Beets, but I got sick of Somebody Else's Metadata because of MusicBrainz's (former?) policy where everything must be Title Cased regardless of how it's presented on the CD sleeve. I prefer my tags to match the artist's choice, because I consider it a tonal indicator that helps set the mood for the work.

It seems like they might not enforce that any more since the album I was going to pick on as an example is now tagged like I have it, although I also have lower-case “my bloody valentine” Artist tags on every track with Title Cased “My Bloody Valentine” Album-Artist tag for browsing in Navidrome: https://musicbrainz.org/release/1e4c282b-8b0d-4d20-9f74-175f...

…but I already got out of the habit and will still just keep typing them out myself :)

I also always include the catalog number in the Comment field and in brackets in my folder names to separate different releases of supposedly the same thing. Good example of why you would want to do this is the 2004 vs the 2007 releases of MM..FOOD? where the last track (Kookies) had to be redone to remove the Sesame Street samples:

- 2004: https://www.youtube.com/watch?v=Ci_XcL4nYos

- 2007: https://www.youtube.com/watch?v=8iYSwvdEfeY

Shout-out to https://covers.musichoarders.xyz/ and https://fanart.tv/ for high-quality album art to embed.

qingcharles · 15h ago

As someone responsible for setting the track naming policy originally at a big streaming company, I can't remember what the policy was. I know I would be called in all the time for crazy shit like Aphex Twin having just a page of equations as track names, or I seem to remember some album by Röyksopp that had just colors printed for the track names and no words. That stuff killed me.

Or the team doing all the ingestion being overworked minimum wage high school grads and suddenly an entire semi truck turns up and it's just pallets of CDs entirely in various East Asian languages.

If I had to do it over I would have two fields, one for whatever best represented what the CD says (and as someone below me points out, this was usually the publisher's artistic discretion and differed between the data they sent, the back of the CD, the track list printed on the CD and the liner notes) and I would have a separate field for Title Cased Titles.

tom_ · 12h ago

Aphex Twin's Selected Ambient Works 2 had a track listing that was 6 pie charts, one image per slice, 1 slice per track. I assume it came on 3 LPs, but I had the 2 CD version, and there were corresponding pie charts printed on the face up sides of the CDs... as if it made it any clearer.

I ripped them about 15 years ago and cddb came up with track names for them, matching the ones in its Wikipedia page (https://en.wikipedia.org/wiki/Selected_Ambient_Works_Volume_...). I wonder if we have any evidence that the mapping from tracks to images is remotely correct.

qingcharles · 8h ago

I came up with the first CD DB for Windows 95 just before it came out. I realized Media Player had an .ini file with the track names and so I had people on Usenet send me all their listings and would reintegrate it and publish it frequently for the first few weeks until I realized .ini files had a 64KB limit and that was the end of that.

If I did it again, which I have planned for a long time, I would require citations for every track listing. Sure, it's a big barrier, but it'd be nice to get it right where possible. The primary citations would generally be to the album cover, but in cases like the Aphex Twin insanity, cites to things like interviews and label demo releases etc could definitely be valid.

JohnFen · 19h ago

> MusicBrainz's (former?) policy where everything must be Title Cased regardless of how it's presented on the CD sleeve.

Is that why that happens? It was always a baffling thing to me and required manual correction (and is one of the sorts of errors that made MusicBrainz less useful).

pavon · 18h ago

Part of the difficulty is that artists/labels aren't always consistent about the formatting of song titles. It's not uncommon for the capitalization to vary between the back cover of the CD, the printing on the CD itself and the liner notes. And then you have variations between releases of the same CD, and digital releases where the file metadata, and the store listing, and the artist website also all vary. So I can't blame MusicBrainz for choosing to normalize by default. Ideally, you could use normalized case for the Recording and Work song titles, and then stylized for the Release song titles, but most people don't go to that level of detail when entering songs.

JohnFen · 16h ago

Oh, I understand the problem, and I don't blame them either. However, it is a part of why these services stopped being useful to me.

amiga386 · 20h ago

> policy where everything must be Title Cased regardless of how it's presented on the CD sleeve

If the music artist decided how it should be on the CD sleeve, and you can show that, then you can go with that. But more often than not, the sleeve is done by the record company's graphic designers, not the music artist.

https://musicbrainz.org/doc/Style/Titles

> Album and song titles are often found in upper‐case on the back cover of CDs. For example, the album Songs of Love and Hate is written as “SONGS OF LOVE AND HATE” on the cover. This is usually the choice of a graphic designer, not the artist. So, instead of copying the title from the cover, we follow certain rules to capitalize a title.

https://musicbrainz.org/doc/Style/Principle/Error_correction...

> Error Correction: There are many cases of record companies incorrectly reproducing titles or even artist names, or breaking generally accepted rules of usage for stylistic purposes. In such cases it often makes sense to fix errors and standardize irregularities, valuing correct spelling, punctuation and grammar over faithfulness to the printed release cover.

> Artist Intent: Artists sometimes choose to present names and titles in ways that deliberately contradict the rules of the language they're in (e.g. unorthodox spellings) and/or the MusicBrainz Style Guidelines. To describe the way we handle such choices, we use the term "artist intent." The general idea is that if an artist intended something to be written in a special way, then MusicBrainz should follow that intent. Unfortunately, it can be difficult to find out what an artist intended. If you want to claim that some deviation from the Style Guidelines should be considered artist intent, the burden of proof lies on you.

ItsHarper · 15h ago

Seems reasonable. I'd think this should be pretty straightforward for songs new enough to be released online. If it's capitalized a certain way on Spotify, that's almost certainly what the artist intended.

Avamander · 20h ago

I can't recall when something like that was enforced. Artistic intent is definitely something that editors and guidelines intend to preserve. Though in some cases it might be hard to determine if something is a mistake or intentional - there are incredibly weird releases.

pflenker · 21h ago

Somewhat related: some conscious artistic choices - such as writing down two tracks but delivering them as one (not sure if this is what happened here) can’t really be transferred into databases.

I own a cd where one track name is a small icon depicting a heart stabbed with a rather lengthy knife. To my knowledge, this track has no canonical name. Any digital version of this cd betrays the respective author‘s interpretation of the icon.

And then, of course, there’s „Love Symbol“: https://en.wikipedia.org/wiki/Prince_(musician)

FearNotDaniel · 3h ago

> can’t really be transferred into databases

Of course they can, it's up to the person designing the database schema to anticipate what is a common artistic practice and model the data accordingly. It might be that specific databases like MusicBrainz and Gracenote haven't accounted for that, but if you own the schema you can easily set up a one-to-many (parent/child) relationship between physical track and song name.

One extreme example of this would be the "Lovesexy" album by (the artist formerly known as) Prince, which in its original CD form had only one track, containing 9 songs. I think the Spotify version is still faithful to this.

This and many other common "conscious artistic choices" ought to be collected into a "Falsehoods Programmers Believe About Recorded Music", if that is not already a thing...

In your example above, yes it's true that many song titles and artist names are fully and partially graphic symbols with no direct text representation (another thing Prince was fond of), but again given the prevalence of this there's no reason a smart data schema couldn't model a song or artist having a 'canonical' name that can only be represented by some graphic format along with one or more pronounceable/text-encodable alternatives (TAFKAP/Love Symbol) and so on; and of course tracking the fact that the 'preferred' identifier can change over time (Cat Stevens/Yusuf Islam, to mention a non-Prince example).
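A minimal sketch of what such a schema could look like, using plain dataclasses rather than a real database. All class and field names here (`Name`, `Named`, `PhysicalTrack`, `since`) are hypothetical, the titles are placeholders, and none of this reflects how MusicBrainz or Gracenote actually model things.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Name:
    text: Optional[str] = None   # text-encodable form, if one exists
    image: Optional[str] = None  # path to a graphic form (symbol, icon)
    since: date = date.min       # when this name became the preferred one


@dataclass
class Named:
    """Anything with names that can change over time (song or artist)."""
    names: List[Name] = field(default_factory=list)

    def preferred_on(self, day: date) -> Optional[Name]:
        # The preferred name is the most recent one already in effect.
        current = [n for n in self.names if n.since <= day]
        return max(current, key=lambda n: n.since, default=None)


@dataclass
class PhysicalTrack:
    position: int
    songs: List[Named] = field(default_factory=list)  # one track, many songs


# One physical track carrying nine songs (placeholder titles).
track = PhysicalTrack(1, [Named([Name(text=f"Song {i}")]) for i in range(1, 10)])

# An artist whose later preferred name is a graphic with a text alias.
artist = Named([
    Name(text="Original Name"),
    Name(text="Spoken Alias", image="symbol.svg", since=date(2000, 1, 1)),
])

print(len(track.songs))                             # 9
print(artist.preferred_on(date(1995, 1, 1)).text)   # Original Name
print(artist.preferred_on(date(2005, 1, 1)).image)  # symbol.svg
```

A title that exists only as an icon is just a `Name` with `image` set and `text` left as `None`, so nobody has to invent a canonical spelling for it.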

indrora · 19h ago

How about "Naming the CDs"

There's a handful of albums that MusicBrainz doesn't quite have the right cd naming for since one was labeled "LEFT" and the other "RIGHT" and not 1/2 -- there is no canonical 1/2 order.

Sniffnoy · 20h ago

What's the CD?

sandreas · 11h ago

Maybe for archival purposes you could use `redumper` (https://github.com/superg/redumper) to prevent ripping mistakes.

My personal workflow:

  - rip the audio CD via EAC with acousticID (flac)
  - retrieve metadata via beets in a script completely automated
  - convert flac to mp3 via beets inplace convert (see below)
  - backup the flac files to another location
  - self-host navidrome and use the substreamer / dsub app and smart playlists to listen "on the go" (The Apple usb-c-to-audiojack adapter is pretty decent)
  - transfer this via iTunes VM to my good old iPod Nano 7g as main listening device for audiobooks

If anyone is looking for fast and accurate ripping hardware, recently I updated my recommended hardware list including a linked tutorial for EAC:

https://pilabor.com/blog/2022/10/audio-cd-ripping-hardware/

beets convert config:

  convert:
    auto: no
    ffmpeg: /usr/bin/ffmpeg
    opts: -codec:a libmp3lame -qscale:a 0 -ac 2 -ar 48000 -map_metadata 0 -movflags use_metadata_tags
    max_bitrate: 192
    threads: 1

maeln · 7h ago

~Why use MP3 instead of opus, vorbis or AAC ? All of them have (most of the time) better compression ratio (and better quality) than MP3. Is it for compatibility reason ?~

edit: Ah, I missed the ipod nano part

sandreas · 2h ago

Just compatibility and "high enough" quality. Works in my car, on my iPod, on my Phone, on my kitchen radio and is the most common format in general.

nani8ot · 6h ago

iPod Nano 7th gen. does support AAC (AIFF & WAV too).

kevin_thibedeau · 21h ago

There's always going to be outliers but I find MusicBrainz pretty useful. I note that a lot of CD-text has poor application of title capitalization and MB usually has it in a more rational form. My ripping system presents a choice when both are available and I usually pick MB. There's also the benefit that the MB database is Unicode and CD-text is whatever the authoring tool used which is usually CP1252 but sometimes not.
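For anyone wondering what the encoding difference looks like in practice, here is a small sketch. It assumes the CD-Text bytes really are CP1252 as described above; `decode_cd_text` is a hypothetical helper, and trying UTF-8 first is only a heuristic, not something any standard prescribes.

```python
raw = b"R\xf6yksopp"  # bytes as a CP1252 authoring tool might store them

print(raw.decode("cp1252"))  # Röyksopp

try:
    raw.decode("utf-8")
except UnicodeDecodeError:
    print("not valid UTF-8")  # 0xF6 cannot start a UTF-8 sequence


def decode_cd_text(data: bytes, encodings=("utf-8", "cp1252")) -> str:
    """Try each encoding in order; fall back to Latin-1, which never fails."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    return data.decode("latin-1")
```

The catch is that a wrong guess still "succeeds" and just produces the wrong characters, which is one reason a Unicode database is nicer to pull from.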

Asmod4n · 22h ago

CD Text is a thing; sadly, no major label uses it anymore to embed metadata in their records, which would make something like MusicBrainz unnecessary.

Sony was a big supporter of it ~25 years ago.

trentnix · 18h ago

For the younger crowd: fancy head units (that's what we called the essential aftermarket CD player/receiver in the dash of your vehicle) would show you CD Text with artist, album, and track name. It would melt the brains of your friends when the name of the song that was playing would scroll by on an old-school, single- or multi-line LCD display. It was a massive flex in its day.

Good times...

badc0ffee · 17h ago

My 2006 Toyota had that. What I really wanted was an aux port, or even a cassette deck I could use with an adapter to plug in my iPod. Instead I had to make do with a FM transmitter plugged into the cigarette lighter.

rhinoceraptor · 16h ago

My 2017 Focus ST still has a CD player with CD text, and I actually do listen to music on CD in it, the bluetooth quality is noticeably worse for whatever reason. I got my first iPod in about 2007 in middle school, and I only ever had about 10-20 CDs growing up, but I started getting into CDs about a year ago. It seems like there is a minor resurgence now that vinyl is expensive, since CDs still cost the same as they ever did, and a lot of them are cheaper even without inflation. I picked up a copy of Pretty Hate Machine at a Walmart for $8 the other day.

asciimov · 15h ago

Of course Sony was, because they own the patent for it.

The reason other labels, and most cd units, don’t use CD-Text is companies don’t want to pay for the license.

Henchman21 · 20h ago

When I was building out infrastructure to support streaming at Sony Music Entertainment, it was well known that interns would input the metadata. Typos were rife and genres? Made up out of whole cloth.

It feels safe to assume that the situation has improved since then, but I doubt seriously we’ll ever be free of typos ;)

JohnFen · 16h ago

> genres? Made up out of whole cloth.

The problem with genre remains entirely unsolved across the board. The solution I use in my collection is to do what everyone else seems to do: make them up out of whole cloth. Because I'm the only one making them up, it means my labeling is at least internally consistent.

TylerE · 7h ago

The biggest issue with genres is most databases treating them as one to one rather than one to many.

JohnFen · 32m ago

That's a real issue. I think the biggest issue with genre, though, is that even if people agree on a list of possible genre labels, there is often disagreement about what music belongs in which genre.

This isn't a new problem at all. Even music labels often disagree. Back when record stores were a thing, it was pretty common for different stores to categorize the same albums differently in terms of genre. I think the only way to avoid it is to stick to very, very broad categories. "Rock", for instance, covers an amazingly broad set of styles.

Henchman21 · 14h ago

I will admit that I do precisely the same with my collection! But I truly felt that those interns should’ve received a list to choose from, not an open text field.

lloydatkinson · 20h ago

It's sad Sony put the effort into writing rootkits for music CD's but did nothing to automate, flag, fix typos for metadata...

mxuribe · 17h ago

I remember the Sony rootkits...Since then and to this day, i avoid buying anything related to Sony as best i can. Funny thing is, folks who know me know that i am not the kind of person who holds a grudge....but something about that rootkit event really brought the ire in me....one of the extremely few times where i held a grudge. So, i avoid Sony and go on with my life.

I also stop buying at other companies...but for other companies for some reason i don't hold onto the ire...i just stop buying from them, and quietly move on...but Sony....i don't get it, but the dislike is crazy.

Henchman21 · 16h ago

I recall a meeting where my team was asked to do some technical legwork for the implementation. To his credit, my boss stood up, said some words about ethics, and led our team out the door. It wasn’t the entire org… just the music business folks as I recall. I left shortly thereafter.

qingcharles · 15h ago

I was doing digital ingestion for Sony in Europe and they sent us all those hobbled "unrippable" CDs and asked us to rip them for streaming. They were kinda embarrassed about it.

Henchman21 · 14h ago

Man, I bet you guys went through a couple boxes of sharpies getting those ready to rip! :)

mxuribe · 15h ago

I highly commend you, your boss, and any others who stood up or otherwise rebelled against the despicable Sony leaders who wanted this to be done. I can only imagine that it would not have been easy. My appreciation goes out to you, your boss, and the rest of the team...and i only wish there were more folks like you in the world! For that, thank you sincerely!

Henchman21 · 18h ago

Agreed. I could say tons here, but it’ll suffice to say that I am wildly happy I no longer work there!

vkaku · 6h ago

Perhaps I should create an overlay for MusicBrainz with sub-minute lag called ZombieBrainz.

If you own a CD and send an edit with a $5 donation, it goes on volatile and nightly; It can go to beta instantly for $100 donations and if not it'll have to be flagged for violations. If it needs to happen instantly on stable, $10000 (generous patron tier, where I will write a blog post for this entry as well) else get to it in 3 months.

piperswe · 22h ago

> Edits on MusicBrainz spend 7 days in limbo after they're created

Not all edits, just major ones (e.g. name changes). Minor edits usually get auto-accepted.

Avamander · 20h ago

Faster if someone votes on the edit, which you can request on their IRC/Discord/Discourse if there's a need (like larger or dependent edits).

amiga386 · 21h ago

And just so people know, their edits were applied in March this year...

  Edit #122458416 - Edit medium
  Vote tally: 0 yes : 0 no
  Status: Applied
  Opened: 2025-02-24 00:02 UTC
  Closed: 2025-03-03 01:00 UTC
  For quicker closing: 3 unanimous votes
  If no votes cast: Accept upon closing

infl8ed · 9h ago

Actually, and quite interestingly, it looks like their second edit (to separate the tracks) failed: https://musicbrainz.org/edit/122458694

> Status: Failed dependency. This edit failed either because an entity it was modifying no longer exists, or the entity can not be modified in this manner anymore.

Clicking through to the CD release we can see that it indeed still has those two tracks combined https://musicbrainz.org/release/af4dc096-65d2-4cc5-9e0c-176d...

egypturnash · 22h ago

Damn, MusicBrainz is still running?

"MusicBrainz is operated by the MetaBrainz Foundation, a California based 501(c)(3) tax-exempt non-profit corporation dedicated to keeping MusicBrainz free and open source." - the gloriously retro-looking front page

piperswe · 22h ago

Still running and still doing great! Some of us still curate a local music library instead of streaming ;)

egypturnash · 18h ago

I curate my own library too but it's pretty much all off of Bandcamp. I don't even own a CD drive I could rip with any more.

pavon · 18h ago

Even with digital releases, MusicBrainz often has more detailed metadata than the original files. And if you have a mixed library of rips and digital purchases, it is nice to use a tagger like Picard to enforce consistent directory structure and filenaming.

masklinn · 21h ago

You can curate a music library without ripping CDs tho.

ssl-3 · 5h ago

I can curate my own library of bookmarks within [some other body's music library] without CDs; of course I can.

I can do that with iTunes or Spotify or Tidal or Amazon Music or whatever else.

But none of these bookmarks are necessarily related to my music. They are only just bookmarks that refer to music that might exist within the libraries that these bodies provide.

And while all of these libraries are certainly quite vast, there's a fuckton of (published!) music that these commercial libraries do not provide.

JohnFen · 21h ago

Depends on your musical tastes. A good 25% of the music in my library is not available in any form other than used CDs.

dwedge · 20h ago

What kind of music?

mtillman · 20h ago

Not sure about OP but I have all manner of blues and jazz recordings unavailable via streaming. There are also lots of obscure Japanese game and rock recordings that aren't in Apple or Spotify though to Spotify's credit, they have a lot of game content. Streaming is mostly in service of licenses and margins, which, as a shareholder, makes sense to me.

seba_dos1 · 1h ago

Even local super popular rock bands from 80s don't always have their entire catalog available on streaming services, and solo endeavors of their musicians are often nowhere to be seen there.

shermantanktop · 12h ago

People seem to assume that any decent creative output always gets carried forward to the next form of media tech. But there are 78s that didn’t make it to LP, much less anything after that.

mtillman · 10h ago

Another example: Nick and Nora pre code films weren’t on Netflix the last time I looked.

JohnFen · 20h ago

A wide range, actually. It's more about the time period and artists than musical style. If it's earlier than the 90s and/or from an artist who wasn't big on the charts, it gets more likely that they're not available except on used CD.

In that sense, the depth and variety of good music that is available has been shrinking for a long while now. The advent of streaming seems to have made it worse.

PaulDavisThe1st · 15h ago

By contrast, before I got rid of almost all my vinyl, one particular sub-collection that I had was about 200 12" singles from the London club scene in 1981-1985. Almost none of the tracks ever appeared on CD or were ever released digitally.

All of them were available on youtube, even the whitelabel DJ-only releases!

ZeroGravitas · 21h ago

MusicBrainz has (or at least had) an acoustic fingerprint system for processing audio files too.

Avamander · 20h ago

This is the part that tends to have the most mistakes, if used. It's generally better to provide minimal info manually if the CD wasn't identified by its ID.

piperswe · 19h ago

Indeed! About half of my new music acquisition is on CD, the other half is Bandcamp/Qobuz/7Digital.

OkayPhysicist · 21h ago

Seeing a Mastodon link on a clearly hand-written HTML site is neat.

cloud8421 · 20h ago

I use MusicBrainz and donate every month - yeah data is not perfect, but you can go and fix it yourself if needed, and the UI is extremely functional without any frills.

Sniffnoy · 20h ago

Man, I thought this was going to be about a decoding tool that had some edge case incorrect, but instead it was just about incorrect entries in a database that was used in place of actually decoding...

alexchantavy · 13h ago

Wait so back in the day I remember Winamp let you configure a CDDB thing and it connected to something called.. Gracenote? (Am I remembering that correctly?) iTunes desktop at some point used to handle this all for you and I assumed it was pulling from those sources under the hood. Where did MusicBrainz come from?

CharlesW · 12h ago

https://en.wikipedia.org/wiki/MusicBrainz

”MusicBrainz was founded in response to the restrictions placed on the Compact Disc Database (CDDB), a database for software applications to look up audio CD information on the Internet. MusicBrainz has expanded its goals to reach beyond a CD metadata (information about the performers, artists, songwriters, etc.) storehouse to become a structured online database for music.”

More detail here: https://courses.cs.umbc.edu/771/papers/ieeeIntelligentSystem...

shermantanktop · 12h ago

Gracenote is alive and well, and mostly supporting the video entertainment industry I think but with forays into adtech and other such schemes.

0points · 11h ago

Well written. I am dealing with some similarities setting up a jellyfin server and the episode data of some series are rather incomplete.

So I've been contributing to tmdb for the last half year or so :)

rconti · 19h ago

> Aside from some audio tracks and a table of contents over those tracks, very little extra information is included on a disk - you've pretty much only got the artist name, album name and track names actually burned into the disk.

Huh, I actually didn't think there was any metadata at all.

KwanEsq · 16h ago

Yeah audio CDs do (or at least can) carry those bare bones of metadata, which can be used by some CD players with built-in displays to display the currently playing track title etc.

It's defined by the CD-Text extension[0] to the Red Book standard.

I think classical releases probably make greater use of it to encode things like composer and arranger, since they are more important to that audience, but for the average popular music release you're only going to get the artist and title, and maybe the ISRC that few are going to care about/display anyway.

[0] https://en.wikipedia.org/wiki/CD-Text

dhosek · 19h ago

Yeah he goes on to talk about an external data source for metadata, so this statement is, as far as I know, wrong, even by the standard of what’s in this article.

rconti · 19h ago

no, there's a part about it later, assuming we can take their word for it: (ugh, HN formatting is the _worst_)

------

Taking a look at the metadata embedded into the disk itself, we can see that track 6 is actually titled "Don't Need a Reason" on there:

FILE "./06. Finish Ticket - Nothing Coming Soon.flac" WAVE

  TRACK 06 AUDIO

    TITLE "Don't Need A Reason"

    ISRC USDPK2300133

    INDEX 01 00:00:00
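
For anyone who wants to check their own rips for this kind of mismatch, here is a minimal Python sketch that pulls the per-track TITLE and ISRC out of a cue sheet fragment like the one quoted above and flags tracks whose cue title does not appear in the file name. The names `parse_cue_tracks` and `msf_to_frames` are made up for illustration, and it only understands the few commands shown (FILE, TRACK, TITLE, ISRC, INDEX), so treat it as a sketch rather than a real cue sheet parser.

```python
import re

# The fragment quoted above, with the blank lines removed.
CUE_TEXT = '''FILE "./06. Finish Ticket - Nothing Coming Soon.flac" WAVE
  TRACK 06 AUDIO
    TITLE "Don't Need A Reason"
    ISRC USDPK2300133
    INDEX 01 00:00:00
'''

def msf_to_frames(msf):
    """Convert a cue sheet mm:ss:ff timestamp to CD frames (75 per second)."""
    minutes, seconds, frames = (int(part) for part in msf.split(":"))
    return (minutes * 60 + seconds) * 75 + frames

def parse_cue_tracks(text):
    """Return one dict per TRACK with its file, title, ISRC and index positions."""
    tracks = []
    current_file = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if m := re.match(r'FILE "(.*)" \w+$', line):
            current_file = m.group(1)
        elif m := re.match(r'TRACK (\d+) AUDIO$', line):
            tracks.append({"number": int(m.group(1)), "file": current_file})
        elif tracks and (m := re.match(r'TITLE "(.*)"$', line)):
            tracks[-1]["title"] = m.group(1)
        elif tracks and (m := re.match(r'ISRC (\S+)$', line)):
            tracks[-1]["isrc"] = m.group(1)
        elif tracks and (m := re.match(r'INDEX (\d+) (\d+:\d+:\d+)$', line)):
            tracks[-1].setdefault("indexes", {})[int(m.group(1))] = msf_to_frames(m.group(2))
    return tracks

# Flag tracks whose embedded title does not show up in the ripped file name.
for t in parse_cue_tracks(CUE_TEXT):
    if t.get("title", "").lower() not in t["file"].lower():
        print(f'track {t["number"]}: cue title {t["title"]!r} not in file name {t["file"]!r}')
```

On the fragment above this flags track 6, since the file was named from the online lookup while the TITLE came from the disc.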

dhosek · 19h ago

I had always thought that duplicate disc IDs based on the TOC were unlikely, but it turns out that for discs with fewer tracks (≤4 or so), you can get duplicates quite easily.
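
A rough way to see why short discs collide: the sketch below computes the classic CDDB/freedb-style disc ID, which packs a checksum of the track start times (in whole seconds), the total playing time in seconds, and the track count into 32 bits. This is only an illustration under assumptions. It is not necessarily the ID scheme behind the duplicates mentioned above (MusicBrainz uses a different, hash-based disc ID), and the TOC values are made up.

```python
def digit_sum(n):
    """Sum of the decimal digits of n, as used by the CDDB checksum."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total

def cddb_disc_id(track_offsets, leadout_offset):
    """Compute a CDDB-style ID from offsets given in CD frames (75 per second).

    Offsets include the usual 150-frame (2-second) lead-in.
    """
    checksum = sum(digit_sum(offset // 75) for offset in track_offsets)
    length_seconds = leadout_offset // 75 - track_offsets[0] // 75
    return ((checksum % 0xFF) << 24) | (length_seconds << 8) | len(track_offsets)

# Two made-up two-track discs of equal length:
# track 2 starts at 10 s on one disc and at 100 s on the other.
disc_a = cddb_disc_id([150, 750], 22500)
disc_b = cddb_disc_id([150, 7500], 22500)
print(f"{disc_a:08x} {disc_b:08x} collide={disc_a == disc_b}")
```

With only a few tracks there are very few start times feeding the checksum, so two discs with the same length and track count can end up with the same ID even when their layouts differ a lot.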

devmor · 19h ago

I wonder if this explains why some EPs I have received as ZIPs from friends get tagged incorrectly in programs like Jellyfin.

b0a04gl · 6h ago

that part in the blog where OP mentions CDs don’t store song names at all hit me hard. all these years i thought my old sony player was just lazy, turns out the format never even tried. whole childhood spent memorizing track numbers just to play one damn song.

KwanEsq · 20h ago

Huh, once I saw the image with the discrepancies I immediately assumed 'ah, "Nothing Coming Soon" must be in the pre-gap of "Don't Need a Reason", especially with that track length, and the rip combined that into one music file', but no, turns out it just isn't defined in the disc metadata at all. Wonder if that's a (mastering?) error, given that the TITLE metadata doesn't even include it.

dogman1050 · 14h ago

I've ripped hundreds of CDs and the metadata is usually ok on commercial discs. When ripping CDs I created from LP rips, I use Mp3tag to make it right.

JohnFen · 22h ago

MusicBrainz and CDDB have become error-ridden enough that I've essentially stopped bothering with them and have switched back to just entering the information manually.

dawnerd · 22h ago

It's worse if you're ripping foreign audio. I got a bunch of discs from Japan which I would assume, being Japan and all, there would be excellent data online. Wrong. Every single album got matched to something else.

Even accurip was incorrect. I pretty much don't trust any of the online data sources anymore and just manually enter meta.

And don't do what I did... don't just let beets run unattended. What a pain that was.

qingcharles · 15h ago

I was doing audio + metadata ingestion for the major labels and they sent us a truck load of East Asian CDs of different languages, and here's me with a team of poor minimum wage high school grads looking at me all crazy.

JohnFen · 22h ago

Yes, you're right. Also, with obscure or rare CDs. If they're in the databases at all, the odds are better than 50% that the data is incorrect to some degree, or they are confused with completely different albums.

jeffbee · 22h ago

Isn't that still a labor-saving starting point?

jandrese · 21h ago

Depends how long it takes you to figure out what the problems are and fix them.

Debugging is usually harder than coding, and the amount of data we are talking about is fairly small. Just typing it in could easily be faster.

setr · 22h ago

the problem with false positives is that a single instance means you have to review every record meticulously, because you have no idea where the system has lied to you, or how many times (because the system itself doesn't). If you're going to review everything anyways, it's often better to simply be slow and correct to begin with rather than diff and correct every item.

this is why it's usually better to be overaggressive with saying "I don't know" rather than crossing your fingers and shitting out an answer and hoping you get away with it.
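
a rough sketch of what "say I don't know" could look like for title matching, using Python's standard `difflib`. the function name `match_or_abstain` and both thresholds are assumptions picked for illustration, not anything a real tagger uses as far as I know.

```python
from difflib import SequenceMatcher

def match_or_abstain(query, candidates, min_score=0.9, min_margin=0.05):
    """Return the best candidate, or None when the match is weak or ambiguous."""
    scored = sorted(
        ((SequenceMatcher(None, query.lower(), c.lower()).ratio(), c) for c in candidates),
        reverse=True,
    )
    if not scored or scored[0][0] < min_score:
        return None  # nothing convincing: say "I don't know"
    if len(scored) > 1 and scored[0][0] - scored[1][0] < min_margin:
        return None  # two near-identical contenders: abstain as well
    return scored[0][1]

print(match_or_abstain("Nothing Coming Soon", ["Nothing Coming Soon", "Don't Need A Reason"]))
print(match_or_abstain("Nothing Coming Soon", ["Don't Need A Reason"]))
```

a `None` here costs the user one manual entry, while a confident wrong answer costs them a review of everything.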

dylan604 · 20h ago

When did we switch the conversation to LLM issues? =)

One of the devs for a company I used to work for shocked me when he said "bad data is better than no data" when I asked why the input field was limited to a drop-down of pre-filled values that were irrelevant, with no way of filling in correct data. At that point, I just felt the entire database was suspect.

pavon · 18h ago

It depends. I'd like to argue that you have to enter the information one way or another, so why not share it and save others the work in the future, but in reality it is often quite a bit slower. MusicBrainz likes to collect more information than a normal CD ripper would ask for, with more pages to click through, so that is a bit slower. However, the main annoyance is when you have to make a correction that isn't auto-approved, and then you have to wait 7 days before your tagger/ripper software will see the changes you made. I wish there was a better workflow to tell Picard to use a pending edit[1].

I still always use MusicBrainz, and enjoy contributing to it, but more like others enjoy contributing to Wikipedia, rather than as an efficiency boost.

[1] https://tickets.metabrainz.org/browse/PICARD-1278

JohnFen · 22h ago

Often not, because it's less effort to type the information in fresh than to review and edit the existing information.

I'm not saying the services are always overly incorrect, just that they're incorrect often enough that the path of least resistance was to stop using them.

dylan604 · 20h ago

Plus, it gave me something to do while the CD was importing, rather than just pushing it into the background while I started working on something else and promptly forgot about the import.

lksaar · 22h ago

your best shout for jp cds is hoping someone added them on discogs

bananalychee · 17h ago

I think about half of the Japanese albums I tag have a mistake of some sort on Discogs, such as wrong okurigana or kanji usage. I've corrected some of them myself, but it happens so often that I've mostly given up. In the end it's faster to transcribe from the back cover.

GauntletWizard · 22h ago

I just ripped a small collection (only ~200 discs), and I encountered all of the problems that have been complained about in this thread. I still used Musicbrainz, because it was easier for me to double-check and fix the entries in their DB than to manually type all the data myself.

When bandcamp releases were available but nothing was in the database, I found it quick and simple to copy+paste the track listing into MB and create a new release. Combining it with the TOC I'd already been searching for, I got perfect rips every time without much issue.

Even with a significant amount of time double checking and fixing the metadata, I consider it a good use of time. I was not simply ripping my CDs, I was helping maintain the historical record.

mayneack · 20h ago

There are userscripts to automatically do this from sources like bandcamp: https://musicbrainz.org/doc/Guides/Userscripts

JohnFen · 19h ago

> I was not simply ripping my CDs, I was helping maintain the historical record.

That was how I felt about it in the earlier days, when I'd actively participate in updating/correcting the databases. I stopped feeling that way years ago, though. Right or wrong, it felt like a losing battle as so many corrections were never actually adopted.

cloud8421 · 20h ago

> Even with a significant amount of time double checking and fixing the metadata, I consider it a good use of time. I was not simply ripping my CDs, I was helping maintain the historical record.

This is the spirit - I've started doing the same for releases that don't appear in MusicBrainz and it feels great knowing that I'm not just doing this for myself.

al_borland · 21h ago

Was there a period where it was good? I tried it back around 2001 or 2002 and it produced a mess. I swore it off and figured it wouldn’t be around long. Here we are over 20 years later hearing that it’s too error-ridden to use.

jandrese · 21h ago

These days something like MusicBrainz is effectively a legacy system. So few people buy CDs anymore that there's not a lot of interest in maintaining it. It's fairly hard to even find a computer with an optical disk reader these days, especially if you are looking at laptops.

cloud8421 · 20h ago

Note that the scope of the project goes beyond CDs, it's a catalogue for pretty much any format where you can play music.

Avamander · 20h ago

It's used as the basis in a _lot_ of places. So fixing errors fixes them in a lot of other websites (and infoboxes).

JodieBenitez · 21h ago

Never worked fine for me, at least not fine enough to trust it.

riansanderson · 21h ago

tangentially related- does anyone have a good recommendation on an external CD drive that works well with macOS and has a good form factor and build quality?

I have an ancient thinkpad that I use a couple of times a year _just for reading cds_ and have considered retiring it. But all the CD drives I see on amazon look like disposable crap.

dawnerd · 21h ago

Pick up an internal drive and get a good enclosure. Way better than any of the external junk on Amazon. Better yet get one of the LG bluray drives that support ripping 4k discs. Might need to flash the firmware. That’s what I use and it’s great and really fast for plain cds as a bonus.

aspenmayer · 17h ago

> Might need to flash the firmware.

I’m a fan of LibreDrive, but have you heard about any similar firmwares for this purpose?

More info about LibreDrive on the forum that hosts discussion about it and tools that it works with:

https://forum.makemkv.com/forum/viewtopic.php?t=18856

dawnerd · 14h ago

There was a thread for the drive’s model which I don’t have near me at the moment. Was a kinda sketchy windows app but now makemkv shows libredrive.

TheAmazingRace · 21h ago

Anything made by Pioneer these days is a good choice. That said, Pioneer just recently exited the optical disc drive market a month or so ago, so you'll want to pick up a drive while you still can. They tend to be pricier than your generic external disc drive, but they are dead reliable, and fully compatible with software like EAC and XLD.

I have the Pioneer BDR-XS07S slot loading external BluRay burner drive and it does a great job ripping audio CDs.

dhosek · 19h ago

I bought this Pioneer drive

http://www.amazon.com/exec/obidos/ASIN/B0BN66KFV1/donhosek

last year after having two consecutive drives crap out on me, with both not wanting to eject discs or acknowledge discs that were in the drive, and it has worked perfectly for me for this year. It has my strong endorsement.

echelon_musk · 21h ago

When I wanted one for ripping music CDs to my M1 Mac I bought the cheapest used USB to CD/DVD drive on eBay. It's a LITE-ON eUAU108 and hasn't failed me.

eisa01 · 20h ago

I tried buying a noname drive from AliExpress, and the drive wouldn't rip correctly with XLD...

You could instead salvage the drive from an old MacBook, works great with a cheap adapter

Synaesthesia · 21h ago

The Apple superdrive

giantrobot · 17h ago

While I've ripped hundreds of discs with mine, they do have some downsides. It can be a bitch and a half to get a disc out if it can't be read properly. Even drutil wouldn't eject such discs.

There's also no way to use mini CD/DVDs with them. Not that those were ever super popular but if you have any it's an annoyance.

I replaced my SuperDrive with a 5.25" internal drive in an external powered enclosure. I can always get unreadable discs out easily, have no problem with mini discs, and I'm not stuck with an extremely short USB cable.

A SuperDrive isn't a bad option but there's better available.

k__ · 15h ago

I remember having a game CD+R.

It had scratches and even holes, but somehow it worked, lol.

TylerE · 7h ago

Scratches affect much less than most people think, as long as they're superficial. (For instance, dings from the case are likely fine, but run the tip of a key across the label side...)

Contrary to what most people expect, the data pits on a CD are much closer to the label side than the shiny side - the bottom of the disc is a clear plastic layer that, as far as the optics of the drive are concerned, is out of focus.

dd_xplore · 16h ago

I see a lot of praise for MusicBrainz, is it really that good?

at_a_remove · 17h ago

I am keeping an eye on this thread, as I plan to eventually rip my somewhat large collection, but would prefer to do it just the one time.

Exact Audio Copy, the author seems to have moved on to other interests, which is a shame because I was looking for something compatible with an autoloader. And it looks like dbpoweramp is the only one left in that arena.

I am allllll about the metadata. Also, a thumbnail, synced lyrics if they could be found, custom metadata for hyperlinks back to entries on Discogs and MusicBrainz, perhaps some ReplayGain values in fields on the FLAC, depending on my MP3 processing case ... but I have so many unanswered questions.

MyPasswordSucks · 12h ago

> Exact Audio Copy, the author seems to have moved on to other interests, which is a shame because I was looking for something compatible with an autoloader.

Nah, it's mostly just reached the stage where there's nothing left to do - all the "objective" stuff works as it should, and any feature adds would be a pretty heavy undertaking. It was updated a little less than a year ago, and when I contacted the author he was very responsive.

Would it be nice to have a keyboard shortcut for proper [1] cuesheet creation (ironically, all the options except the proper one have keyboard shortcuts)? Yeah, but I've learned to live with it. Would it be nice to have super-duper tagging options? I dunno, from where I'm sitting, it seems like it'd just be duplicating a bunch of foobar2000 features for negligible gain.

[1] Because nobody wants a .FLAC that starts with a few seconds of silence, inter-track gaps need to be appended to the end of the previous track, which is not how Red Book audio handles it, and means that the "proper" cuesheet format is technically a non-compliant cuesheet.
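
To make the footnote concrete, here is a small Python sketch of the "gap appended to the previous track" layout. It assumes you already know where each track's INDEX 01 falls, in CD frames at 75 per second, and simply cuts the files at those points, so whatever sits between a track's INDEX 00 and INDEX 01 lands at the end of the previous file. The function name and the frame numbers are made up for illustration.

```python
def split_points_gap_appended(index01_frames, leadout_frame):
    """Return (start, end) frame ranges, one per track file, cutting at each INDEX 01."""
    bounds = list(index01_frames) + [leadout_frame]
    return [(bounds[i], bounds[i + 1]) for i in range(len(index01_frames))]

# Made-up disc: track 2 has a 2-second (150-frame) pre-gap starting at frame 15000.
index00 = {2: 15000}   # where track 2's pre-gap begins
index01 = [0, 15150]   # where each track's audio proper begins
files = split_points_gap_appended(index01, 30000)

for number, (start, end) in enumerate(files, start=1):
    print(f"track {number:02d}: frames {start}-{end} ({(end - start) / 75:.2f} s)")
```

Here the pre-gap of track 2 (frames 15000 to 15150) ends up inside the first file, and the second file starts right at the music.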

Why does my ripped CD have messed up track names? And why is one track missing?

Comments (125)