Enhancing MySQL: MySQL improvement project

53 bratao 39 6/1/2025, 1:24:56 AM github.com ↗

Comments (39)

nubinetwork · 1d ago

Why isn't this being upstreamed? I don't feel comfortable using some random patch someone found on github...

voodoo_child · 1d ago

One of the big problems is that Oracle decides what does/doesn’t go in. Understandable, but they provide no insight into the decision-making process, why patches don’t get merged, or even if/when they will take them. They used to post worklogs for features they were working on, which gave some insight, but they’ve stopped doing that now too. Imagine working on a new feature for months and submitting a patch, only to find out Oracle has gone its own route; many people will just say, what’s the point? The last updated worklog was in 2021: https://dev.mysql.com/worklog/

Edit: As an example, one of the optimizations called out in this repo was submitted in October 2023. “Thank you for the report and contribution.” and then radio silence ever since.

https://bugs.mysql.com/bug.php?id=112737 https://www.percona.com/blog/what-oracle-missed-we-fixed-mor...

arp242 · 8h ago

From FAQ:

16. Why aren't these improvements merged into the official MySQL?

Optimizations have been recommended to the official team and have received acknowledgment. However, they are assigned low priority in official bug fixes. Simple optimizations may take considerable time to be integrated, while complex ones might never be implemented.

As a result, the decision was made to open-source the MySQL optimized version to ensure effective application in high-end scenarios.

erulabs · 1d ago

Sort of wild that a small improvement to relay log processing could almost certainly offset one’s entire lifetime of carbon. I mean, I’m genuinely happier with a tiny latency reduction but it’s still wild the scale at which MySQL operates.

Maybe these optimizations can let me avoid moving to Vitess for another year!

ksec · 1d ago

>Maybe these optimizations can let me avoid moving to Vitess for another year!

Any reason why, considering Vitess isn't exactly new and has been stable enough? Other than not wanting to introduce additional complexity unless absolutely necessary.

3cats-in-a-coat · 1d ago

Keep in mind optimization effects can be counter-intuitive, as you need to consider unexpected second-order effects. Say, what if slow queries forced me into optimizing them via a cache that I wouldn't otherwise use, resulting in a 10x improvement, whereas with a slightly faster MySQL I would've reached my initial performance goals without that cache, thus increasing the total carbon footprint?

And if this sounds contrived, this is basically what happened with our hardware vs. software optimization situation. We could do wonders on a 1MHz chip with 2MB of RAM in the 1980s, but now we need literally many thousands of times that capacity just to boot our OS to an empty screen.

Every time hardware improved, software bloated up. Thus eventually we had so much disposable compute just for... again, literally... playing games and crypto scams, that we invented AI running on it. And now that AI is once again blowing up our energy needs.

All that, because hardware kept optimizing, software kept compensating by becoming worse, and thus new use cases revealed themselves that would be impossible before, but rather destructive to climate.

jcgl · 1d ago

I would say that this is basically an example of the Jevons Paradox: https://en.m.wikipedia.org/wiki/Jevons_paradox

Having a cheaper, more available resource increases overall utilization of that resource.

3cats-in-a-coat · 1d ago

I often think this is the crux to the Great Filter. To overcome our natural, purely economic behavior, such as this paradox, and act intelligently considering long-term effects and sustainability.

Because it's easy to just drift along with the tide. Even bacteria follow those "economic" laws with regards to replication, food/energy, competition and so on. It takes no effort. And the only thing that keeps bacteria in check is that there is someone above them, limiting them.

We want to be on top, but we don't act with the according responsibility. We can learn this lesson through misery and pain. But this is also how bacteria learn. It'd be sad if we can't do better.

Notice all multicellular organisms exist through their cells implementing such restraint. Cells restrain themselves on function (gene expression) and on replication. When they stop doing that, they revert to pre-organism behavior, and we call it cancer.

We call ourselves a society, but we're only experiencing brief flashes of what this means. Insects like ants and bees are literally better at it, than we are. Lots of work to be done...

ncgl · 21h ago

Great comment.

I have similar thoughts relating to hive mind vs independent societies or progression under autocratic vs democratic societies.

The collective that can organize fully towards a goal is going to beat a collective that can't, or takes steps backwards every 4 years.

3cats-in-a-coat · 19h ago

Autocracy is one way to organize towards a goal, but since it's very narrow in its field of view, eventually it falls apart from the inside, because it can't balance the interests of the whole with the interests of its constituents individually.

A stable system has fractal stability as you go deeper. You can't make a car go faster than its parts can withstand before they fall apart. Even if you press the pedal really strongly.

Democracy is indeed a compromise - we cripple synchronization at a macro level, so that we get to enjoy some individual freedoms. But it also results in hidden structures of control, like corporations growing so large within a "free market" that they start buying power, and this feeds a cycle of autocracy that is even more toxic than the ideological kind, as it's entirely driven by the profit motive.

So essentially, we first need to escape the false dichotomy of autocracy vs democracy and think about what transcendent paradigm includes the positive elements of both, plus some novel ones, with fewer drawbacks from each. And I think we can mine nature and software architectures for inspiration, as they're rich with working models we haven't even tried yet at a social level.

But I don't think we're moving towards that. We just wobble between anarchy and fascism and somewhere in the middle is what we call "normal" (but ain't).

vasvir · 1d ago

Question: Do any of these optimizations apply to MariaDB?

iforgotpassword · 1d ago

Was wondering the same. Such a verbose readme and not a single mention of MariaDB.

I know it's moving slower and in a lot of ways it's inferior to mysql, but at the same time that would make it even better to have some contributions like this.

vasvir · 1d ago

That's not my impression. MariaDB looks to me that it has greater development velocity than MySQL.

I don't know about the speed difference, though. I guess a speed rundown would be interesting...

iforgotpassword · 1d ago

Not sure about the current state of affairs. But I think for example MySQL had json columns way earlier than mariadb, and to me they appear to have a nicer design/accompanying functions to work with them.

Also, I ran into a deadlock issue a while ago with mariadb. I skimmed their bug tracker and there are a couple of similar reports, some remaining open, others getting fixes like "well I changed stuff around and I think it might solve the issue". An emerging pattern was that they think there might be a locking issue, but they're not sure, and any deadlock they can reproduce under rr (or whatever that recording debugger is called) is, according to them, an issue with rr itself and a false positive. It gave me serious "we're just yoloing it" vibes. I understand it's not a trivial code base and they're probably understaffed, and I still prefer it over MySQL merely because of who owns that project, but still, it left a weird impression.

femiagbabiaka · 1d ago

Is there somewhere I could read about the JOIN performance degradation in detail?

EDIT: I missed that the authors wrote a GitHub book, including some descriptions of the problem(s): https://enhancedformysql.github.io/The-Art-of-Problem-Solvin...

Side note: One downside of ChatGPT generated documentation (assuming this was written in conjunction with an LLM) is that humans tend to be a little less verbose.

ksec · 1d ago

I have been trying to submit MySQL news to HN, but it has never reached the front page.

After however many years, they have finally released 9.0 and are now at 9.3. I wonder how many of the problems stated in the list are still true.

At least Vitess still gets continuous development.

mdaniel · 1d ago

> 14.7MB

You know, a patch file can individually address each upstream file it intends to modify, right? I presume someone who wants to casually read them would need to fork the repo, cut up the ginormous .patch file into the 2361 individual patches for ease of reading or deep-linking

I also just for-real don't understand how in the universe a ~15MB text file against an open source _git hosted_ project is a sane way of delivering value. Not a single time in the readme did they say why $(git diff origin/tags/8.0.42...HEAD > yolo.patch) was the chosen delivery mechanism
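For anyone who does want to read the changes file by file, the "cut up the ginormous .patch" step can be sketched in a few lines of Python. This is only a rough sketch: it assumes the patch is in `git diff` format (one `diff --git a/... b/...` header per file, which is a guess about this repo's file), the `split_patch` helper and the sample paths are made up for illustration, and paths containing spaces are not handled.

```python
import re
from typing import Dict, List

# Matches the header that `git diff` emits at the start of each file's section.
FILE_HEADER = re.compile(r"^diff --git a/(?P<old>\S+) b/(?P<new>\S+)$")


def split_patch(patch_text: str) -> Dict[str, str]:
    """Split a combined `git diff` into one patch string per modified file."""
    per_file: Dict[str, List[str]] = {}
    current = None
    for line in patch_text.splitlines(keepends=True):
        match = FILE_HEADER.match(line.rstrip("\n"))
        if match:
            # A new file section starts here; key it by the post-change path.
            current = match.group("new")
            per_file[current] = []
        if current is not None:
            per_file[current].append(line)
    return {path: "".join(lines) for path, lines in per_file.items()}


# Tiny made-up patch touching two files (paths are illustrative only).
SAMPLE = """\
diff --git a/sql/foo.cc b/sql/foo.cc
--- a/sql/foo.cc
+++ b/sql/foo.cc
@@ -1 +1 @@
-int x = 1;
+int x = 2;
diff --git a/sql/bar.h b/sql/bar.h
--- a/sql/bar.h
+++ b/sql/bar.h
@@ -1 +1 @@
-#define N 1
+#define N 2
"""

pieces = split_patch(SAMPLE)
for path, body in pieces.items():
    print(path, len(body.splitlines()), "lines")
```

Each value in `pieces` is itself a valid single-file patch, so it could be written out to its own file and reviewed or applied separately.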

throwdbaaway · 1d ago

https://github.com/google/mysql-tools/tree/master/old/mysql-... - Google used to do that too

mdaniel · 1d ago

Well, if Google does it then I guess I stand corrected about it being a weirdo way to deliver patches. They went so far as to .gz theirs, too, for extra non-browsing by mere mortals.

I find it curious that <https://github.com/google/mysql-tools/blob/02d18542735a528c4...> and yet <https://github.com/google/mysql-tools/blob/02d18542735a528c4...> says "diff -ruN base/client/mysqldump.c mysql40gpl/client/mysqldump.c"

I had no idea one could release patches of GPL software under an Apache license. That makes my head hurt.

theMMaI · 1d ago

For Google it's undoubtedly only done because under the license agreement they must make their source code modifications available if someone asks. A form of malicious compliance if you wish

fipar · 1d ago

MySQL is available in GitHub (so in that sense hosted) but development doesn’t happen there. Not saying that’s the reason for the delivery mechanism though.

lmz · 1d ago

Patches to MySQL are nothing new, but who's behind this patch set?

wejick · 1d ago

He is actually quite active on X, and is writing a book [1] on the journey. I actually enjoy his insights more often than I'd like.

[1] https://github.com/enhancedformysql/The-Art-of-Problem-Solvi...

santa_boy · 1d ago

Curious: is there an easy way to do row-level and field-level security in MySQL?

fipar · 1d ago

Field-level is just column privileges. Row-level, I think you can only achieve that with views, which is less than ideal.

bawolff · 1d ago

Views?

user32489318 · 1d ago

I guess he meant creating a view of a table whose “where” conditions depend on your user privileges

bawolff · 1d ago

Yes, using the DEFINER clause.

For all I know there could be other methods in mysql at this point, but views are how people have been doing fine-grained row permissions in mysql for decades.
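To make the two approaches from this subthread concrete, here is a rough Python sketch that only renders the MySQL statements as strings (it does not connect to a server). Everything named in it is hypothetical: the `shop.orders` table, its `owner` column, and the accounts. It also assumes the row owner is stored as the bare user name, and it does no identifier quoting, so treat it as an illustration rather than something to paste into prod. The view relies on `USER()` reporting the connected client account even inside a `SQL SECURITY DEFINER` view.

```python
from typing import List


def column_grant(db: str, table: str, columns: List[str], account: str) -> str:
    """Field-level access: grant SELECT on only the listed columns."""
    cols = ", ".join(columns)
    return f"GRANT SELECT ({cols}) ON {db}.{table} TO {account};"


def row_filter_view(db: str, view: str, table: str, owner_column: str, definer: str) -> str:
    """Row-level access: a DEFINER view that only exposes the caller's rows."""
    return (
        f"CREATE DEFINER = {definer} SQL SECURITY DEFINER VIEW {db}.{view} AS\n"
        f"  SELECT * FROM {db}.{table}\n"
        # USER() is 'name@host' for the connected client; keep just the name.
        f"  WHERE {owner_column} = SUBSTRING_INDEX(USER(), '@', 1);"
    )


def view_grant(db: str, view: str, account: str) -> str:
    """Users get access to the view only, never to the base table."""
    return f"GRANT SELECT ON {db}.{view} TO {account};"


# Hypothetical schema and accounts, for illustration only.
statements = [
    column_grant("shop", "orders", ["id", "status"], "'report'@'%'"),
    row_filter_view("shop", "my_orders", "orders", "owner", "'app_admin'@'localhost'"),
    view_grant("shop", "my_orders", "'alice'@'%'"),
]
print("\n\n".join(statements))
```

The point of the DEFINER setup is that `alice` needs no privileges on `shop.orders` itself: the view runs with the definer's rights, and the WHERE clause is what narrows the result to her rows.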

orionblastar · 1d ago

Do they have stored procedures yet?

rjh29 · 1d ago

There should be a blog post to point to people who have a 1990s view of MySQL. It has changed a lot and has most of PostgreSQL's feature set while being faster and simpler to use for the average user.

mdaniel · 23h ago

> and has most of PostgreSQL's feature set while being faster and simpler to use for the average user.

I would read that blog post, because I am firmly in the "mysql/mariadb is for people who like mongo" camp but I like learning new things

While I was delighted to see that 11 no longer just straight-up throws input data in the trash, I get a lot of mileage out of transactional DDL which seems to still be a PG feature

  docker run -d --name my -e MARIADB_ROOT_PASSWORD=sekrit docker.io/library/mariadb:11.4.7

  docker exec -i my mariadb -psekrit mysql <<SQL
  BEGIN;
  CREATE TABLE just_kidding (pk int);
  CREATE TABLE onoz (migrations are hard, yo);
  ROLLBACK;
  SQL

  docker exec -i my mariadb -psekrit mysql <<SQL
  SELECT count(1) FROM just_kidding;
  SQL
  count(1)
  0

While digging into its stored-proc story, I found these two gems

https://mariadb.com/kb/en/sql_modemssql/

https://mariadb.com/kb/en/sql_modeoracle/

which I would enjoy exploring more

evanelias · 13h ago

> I was delighted to see that 11 no longer just straight-up throws input data in the trash

You're essentially proving the upthread commenter's point here... the relevant setting is strict sql_mode, which has been available as an option for literally 20 years, and has generally been used by any serious MySQL/MariaDB shop for that whole time. Long ago it wasn't enabled by default out of the box, but it has been since MySQL 5.7 (released 10 years ago) and MariaDB 10.2 (released over 8 years ago).

> I get a lot of mileage out of transactional DDL which seems to still be a PG feature

Correct, MySQL and MariaDB do not support transactional DDL, and maybe never will. That's not a unique shortcoming though, as Oracle and SQLite don't support it either. MS SQL Server does support it, but if I recall correctly there are caveats depending on the isolation level in use.

Postgres clearly wins out on that feature, but as with everything in computing, it comes with serious trade-offs: a rather sub-par MVCC implementation [1], and lack of DDL support in logical replication [2].

I'm biased because I work in this space, but IMO it's easy to live without transactional DDL in MySQL/MariaDB if you pair a good schema management system (which allows you to test and lint DDL) with an online schema change tool (which allows you to throw away the shadow table if something goes wrong). And generally you shouldn't be running DDL by hand directly in prod anyway...

[1] https://www.cs.cmu.edu/~pavlo/blog/2023/04/the-part-of-postg...

[2] https://www.postgresql.org/docs/current/logical-replication-...

mdaniel · 12h ago

Well, I did say that I'd enjoy learning more, because I'll join in any good rant about autovacuum being some "oh gawd"

But, let's be real talk here: what kind of RDBMS ships with a flag named "strict sql mode" that's available to be set to something? Its reputation wasn't born from uncharitable twitter take downs, it was the kind of thing I had to try in order to know if it still did crazypants things like `CREATE TABLE foo (d DATE); INSERT INTO foo VALUES ('lololo')`. So, sure, I hear you about "it hasn't been stupid for 10 years" but don't lose track of the first part of that qualifier

I hear rumors that Mongo isn't insaney pants anymore, either, but I for damn sure ain't running that shit and will quit places that do

evanelias · 4h ago

> what kind of RDBMS ships with a flag named "strict sql mode" that's available to be set to something?

How about some of the most widely deployed database software on earth? For example SQLite doesn't even enforce column types by default today, and has only had the option to do so for less than four years!

In MySQL/MariaDB's case, yes they should have changed that default much earlier, but they historically over-indexed on backwards compatibility concerns around that time period.
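The SQLite point is easy to check from Python's bundled `sqlite3` module. A minimal sketch, assuming the linked SQLite library is 3.37.0 or newer for the `STRICT` half (older builds just skip it); the table names `loose` and `tight` are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Default (non-STRICT) table: the declared type is only an affinity hint,
# so a non-numeric string is stored as text in an INTEGER column.
conn.execute("CREATE TABLE loose (n INTEGER)")
conn.execute("INSERT INTO loose VALUES ('lololo')")
loose_row = conn.execute("SELECT n, typeof(n) FROM loose").fetchone()
print(loose_row)

# STRICT tables need SQLite 3.37.0 or newer; skip the check on older builds.
strict_supported = sqlite3.sqlite_version_info >= (3, 37, 0)
strict_rejected = None
if strict_supported:
    conn.execute("CREATE TABLE tight (n INTEGER) STRICT")
    try:
        conn.execute("INSERT INTO tight VALUES ('lololo')")
        strict_rejected = False
    except sqlite3.Error:
        # The STRICT table refuses a value that cannot be stored as INTEGER.
        strict_rejected = True
print(strict_supported, strict_rejected)
```

The first insert succeeds and the value comes back as text; with `STRICT` the same insert raises an error instead, which is the opt-in behavior being compared to strict sql_mode.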

gcbirzan · 22h ago

That comment was wrong, yes, but I'm actually curious why you find MySQL faster and simpler to use for the average user.

hu3 · 1d ago

> Do they have stored procedures yet?

MySQL has supported Stored Procedures since version 5.

That was 20 years ago.

No offense, but I'm always curious about how people solidify easily verifiable misinformation as facts. I'm assuming most spread it out of ignorance, not malice.

OptionOfT · 1d ago

There is a whole new group / generation of people who ask questions expecting a ready-to-consume answer, and who are no longer capable of doing the research themselves.

I feel that as soon as ChatGPT & friends start with native advertising, a whole new market will open up. And it won't be good.

kruffalon · 1d ago

Why would you put someone of at least 30yrs old (but probably more like 40yrs old as I can't imagine a 10-15yr old caring that much about stored procedures) into this elusive new generation that you obviously just want to bash for no apparent reason?

ksec · 1d ago

MySQL is partly to blame, as it hasn't been doing the right / enough marketing. And it's generally hard to submit MySQL content anywhere.
