I always thought APL was written in the wrong direction. It writes like a concatenative language that's backwards: you tack things onto the front. NumPy fixes it by making the verbs all dotted function calls, effectively mirroring the order. E.g. in APL you write 10 10⍴⍳100, but in NumPy you write np.arange(1, 101).reshape(10, 10).
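To illustrate the mirrored reading order, here is a runnable sketch of the NumPy side (APL's ⍳ is 1-based with the default index origin, hence arange starting at 1):

```python
import numpy as np

# APL: 10 10⍴⍳100 reads right to left (make 1..100, reshape to 10x10).
# NumPy expresses the same pipeline left to right via method chaining:
grid = np.arange(1, 101).reshape(10, 10)

print(grid.shape)  # (10, 10)
print(grid[0, 0], grid[-1, -1])  # first and last elements: 1 and 100
```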
trjordan · 1h ago
Seems like it could easily be training data set size as well.
I'd love to see some quantification of errors in q/kdb+ (or hebrew) vs. languages of similar size that are left-to-right.
fer · 31m ago
>Seems like it could easily be training data set size as well.
I'm convinced that's the case. On any major LLM I can carpet-bomb Java/Python boilerplate without issue. For Rust, at least last time I checked, it comes up with non-existent traits and more frequent hallucinations, and struggles to use the context effectively. In agent mode it turns into a fist fight with the compiler, often ending in credit-destroying loops.
And don't get me started when using it for Nix...
So not surprised about something with orders of magnitude less public corpus.
dotancohen · 13m ago
I realized this too, and it led me to the conclusion that LLMs really can't program. I did some experiments to find what a programming language would look like, instead of e.g. python, if it were designed to be written and edited by an LLM. It turns out that it's extremely verbose, especially in variable names, function names, class names, etc. Actually, it turned out that classes were very redundant. But the real insight was that LLMs are great at naming things, and performing small operations on the little things they named. They're really not good at any logic that they can't copy paste from something they found on the web.
weird-eye-issue · 8m ago
> I did some experiments to find what a programming language would look like, instead of e.g. python, if it were designed to be written and edited by an LLM.
Did your experiment consist of asking an LLM to design a programming language for itself?
dotancohen · 4m ago
Yes. ChatGPT 4 and Claude 3.7. They led me to similar conclusions, but they produced very different syntax, which led me to believe that they were not just regurgitating from a common source.
dlahoda · 26m ago
i tried gemini, openai, copilot, and claude on a reasonably big rust project.
claude worked well for fixing use statements, clippy lints, renames, refactorings, and ci. i used the highest-cost claude with custom context per crate.
i was never able to get it to write new code well.
for nix, it is a nice template engine for getting started or searching. i did not try big nix changes.
gizmo686 · 48m ago
Hebrew is still written sequentially in Unicode. The right-to-left aspect there is simply about how the characters get displayed. In mixed documents, there are U+200E and U+200F to change the text direction mid-stream.
From the perspective of an LLM learning from Unicode, this would appear as a delimiter that needs to be inserted on language-direction boundaries; everything else should work the same.
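A small Python sketch of the logical-order point (the Hebrew word שלום here is just an example):

```python
# Hebrew text is stored in logical (first-typed-first) order;
# right-to-left rendering is purely a display concern.
hebrew = "שלום"  # "shalom": shin, lamed, vav, final mem
assert hebrew[0] == "ש"  # the first code point is the first-typed letter

# U+200E (LRM) and U+200F (RLM) are invisible marks that nudge the
# bidi algorithm at direction boundaries in mixed-direction text:
LRM = "\u200e"
RLM = "\u200f"
mixed = "version 2" + LRM + " של " + RLM + "the app"
print(len(mixed))  # the marks occupy code points but render as nothing
```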
cubefox · 28m ago
> Hebrew is still written sequentially
Everything is written sequentially in the sense that the character that is written first can only be followed by the character that is written next. In this sense writing non-sequentially is logically impossible.
dotancohen · 19m ago
An older Hebrew encoding actually encoded the last character first, then the penultimate character, then the character preceding that, etc.
Exercise for the reader: guess how line breaks, text wrapping, and search algorithms worked.
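A sketch of the difference, assuming the old encoding simply stored characters in visual (display) order rather than typing order:

```python
logical = "שלום"        # stored as typed: shin, lamed, vav, final mem
visual = logical[::-1]  # visual-order storage: leftmost displayed character first

# Converting between the two is a plain per-line reversal, which is why
# line breaking and wrapping had to be decided *before* encoding, and why
# a substring search against visual-order text needs a reversed needle:
needle = "של"
print(needle in logical)          # found in logical-order text
print(needle[::-1] in visual)     # must be reversed to match visual-order text
```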
goatlover · 21m ago
Multiple characters can be written at once; they can also be written in reverse or out of order.
cubefox · 17m ago
No no, the second character you write must always be temporally preceded by the character you wrote first. Otherwise the second wouldn't have been the second, but the first, and moreover, the first would have been the second, which it wasn't.
dotancohen · 11m ago
I encourage you to find some place that still uses a Hebrew typewriter. When they have to type numbers, they'll type the number in backwards. And an old Hebrew encoding also encoded characters in reverse order.
vessenes · 1h ago
Interesting. Upshot: right-to-left evaluation means you generally must start at the end, or at least hold an expression in working memory, and LLMs are not so good at that.
I wonder if diffusion models would be better at this; most start out as sequential token generators and then get finetuned.
Humans can't either? I think if this convention had been a more usable form of programming, we'd know by now.
anonzzzies · 26m ago
Once you get used to it, the traditional ways look tedious and annoying to me. I think the power is in 'once you get used to it'. That will keep out most people. Compare Python LLM implementations with k ones as a novice and you will see verbose unreadable stuff vs line noise. Once you learn the math, you see verbose code where the verbosity adds nothing at all vs exactly what you would write if you could.
maest · 55m ago
I think there is a reason for this, but maybe not a good one.
1. Function application should be left to right, e.g. `sqrt 4`
2. Precedence order should be very simple. In k, everything has the same precedence (with the exception of brackets)
Points 1 and 2 together force you into this right-to-left convention, annoyingly.
Fwiw, I think 2 is great and I would rather give up 1 than 2. However, writing function application as `my_fun arg` is a very strong convention.
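A toy Python sketch of rule 2, uniform precedence with right-to-left grouping; this is an illustration, not an actual k parser (no parentheses, and `%` is division as in k):

```python
import re

def eval_rtl(expr: str) -> float:
    """Evaluate + - * % with no precedence, grouping right to left (k-style)."""
    tokens = re.findall(r"\d+(?:\.\d+)?|[+\-*%]", expr)
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "%": lambda a, b: a / b}
    # Fold from the right: a OP (everything to the right).
    result = float(tokens[-1])
    for i in range(len(tokens) - 2, 0, -2):
        op, left = tokens[i], float(tokens[i - 1])
        result = ops[op](left, result)
    return result

# 2*3+4 groups as 2*(3+4) = 14, unlike conventional precedence (10):
print(eval_rtl("2*3+4"))   # 14.0
print(eval_rtl("10-2-3"))  # 10-(2-3) = 11.0
```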
cess11 · 1h ago
"Claude is aware of that, but it struggled to write correct code based on those rules"
It's actually not, and unless they somehow run a rule engine on top of their LLM SaaS stuff, it seems far-fetched to believe it adheres to rule sets in any way.
Local models confuse Python, Elixir, PHP and Bash when I've tried to use them for coding. They seem more stable for JS, but sometimes they slip out of that too.
Seems pretty contrived and desperate to invent transpilers from quasi-Python to other languages to try and find a software development use for LLM SaaS. Warnings about Lisp macros and other code rewrite tools ought to apply here as well. Plus, of course, the loss of 'notation as a tool of thought'.
strangescript · 49m ago
If your model is getting confused by Python, it's a bad model. Python is routinely the best language for all major models.
cess11 · 10m ago
I don't know what counts as a major model. Relevant to this, I've dabbled with Gemma, Qwen, Mistral, Llama, Granite and Phi models, mostly 3-14b varieties but also some larger ones on CPU on a machine that has 64 GB RAM.
rob_c · 1h ago
Same reason the same models don't fundamentally understand all languages: they're not trained to. Frankly, the design changes needed to get this to work in training are minimal, but this isn't the way English works, so expect most of the corporate LLMs to struggle, because that's where the interest and money are.
Give it time until we have true globally multi lingual models for superior context awareness.
strangescript · 48m ago
A byte-tokenized model is naturally 100% multilingual across all languages in its data set. There just isn't a lot of reason for teams to spend the extra training time to build that sort of model.
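A sketch of why byte-level tokenization is language-agnostic: the vocabulary is just the 256 possible UTF-8 byte values, so any script round-trips identically.

```python
def byte_tokenize(text: str) -> list[int]:
    """Tokenize any Unicode text into a fixed vocabulary of 256 byte values."""
    return list(text.encode("utf-8"))

def byte_detokenize(tokens: list[int]) -> str:
    """Reassemble the original text from its byte tokens."""
    return bytes(tokens).decode("utf-8")

# English, Hebrew, and APL glyphs all round-trip through the same vocabulary:
for sample in ["hello", "שלום", "⍳100"]:
    tokens = byte_tokenize(sample)
    assert all(0 <= t < 256 for t in tokens)
    assert byte_detokenize(tokens) == sample
```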
benjaminwootton · 59m ago
I just submitted a similar article about using LLMs (Gemini and Claude) to write SQL which I found to be very successful.
As ClickHouse (which I used in that test) is sometimes compared with Kdb+ I thought it was worth dropping a link here.
As I mention in the article, I tried this stuff a year or two ago and it was complex and flaky. With MCP servers and better reasoning models, being able to ask questions in natural language is just about crossing over to being viable, IMO.
https://news.ycombinator.com/item?id=44509510