How deterministic is the emit, really? If I feed in the same expression tree twice, with the same node layout and the same captures, do I get exactly the same bytes out every time (ignoring relocations) or not? If the output is byte-stable across runs for the same input graph, that opens up memoized JIT paths. It's worth checking whether the current implementation already does this or needs a pass to normalise allocation order.
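
A rough way to test that, sketched in Python rather than against NativeJIT itself: emit the same tree several times, compare digests of the output, and cache by a digest of a canonical encoding of the tree. Everything here is hypothetical, including the nested-tuple tree encoding, the two toy emitters and the `MemoizedJit` class. The "leaky" emitter stands in for any back end whose output depends on allocation order.

```python
import hashlib
import itertools

# Hypothetical encoding: trees are nested tuples such as
# ("mul", ("param", 0), ("param", 0)). This is not NativeJIT's node layout.

def emit_stable(tree):
    """Toy emitter whose output depends only on the tree's structure."""
    if isinstance(tree, tuple):
        return b"(" + b" ".join(emit_stable(child) for child in tree) + b")"
    return repr(tree).encode("utf-8")

_alloc_counter = itertools.count()

def emit_leaky(tree):
    """Toy emitter that leaks a process-wide allocation counter into its output."""
    return emit_stable(tree) + b"#" + str(next(_alloc_counter)).encode("ascii")

def is_byte_stable(emit, tree, runs=3):
    """Emit the same tree several times and compare digests of the output."""
    digests = {hashlib.sha256(emit(tree)).hexdigest() for _ in range(runs)}
    return len(digests) == 1

class MemoizedJit:
    """Cache emitted code, keyed by a digest of the canonical tree encoding."""

    def __init__(self, emit):
        self._emit = emit
        self._cache = {}
        self.misses = 0

    def compile(self, tree):
        key = hashlib.sha256(emit_stable(tree)).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._emit(tree)
        return self._cache[key]

tree = ("mul", ("param", 0), ("param", 0))
print(is_byte_stable(emit_stable, tree))  # structural emitter
print(is_byte_stable(emit_leaky, tree))   # allocation-order-dependent emitter
```

Keying the cache on the input tree works even if the emitter is not byte-stable. Byte stability matters more when the emitted code itself is compared, deduplicated or persisted across runs.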

jdnend · 9h ago

Why wouldn't it be deterministic?

xnacly · 7h ago

Several possible reasons:

- parallelism
- concurrent machine code gen
- different optimisations for different runs, producing differing machine code order, etc.

smartaz42 · 2h ago

FWIW, for C only, I've used libtcc (repo.or.cz/w/tinycc.git) with great success. The API is a joy, as we all expect from a Bellard project. It focuses on compilation speed; the generated code is not at all optimized.

kookamamie · 7h ago

> auto & rsquared = expression.Mul(expression.GetP1(), expression.GetP1());

This is C++, no? Why not use operator overloading for the project?

dontlaugh · 4h ago

I think they didn't find it useful.

They built this to translate a search query that is only known at runtime. Presumably they already have an AST or similar, so calling methods as it is being walked isn't any harder than operators.

plq · 6h ago

This line is part of the code that creates an AST-like structure that is then fed into the compiler. The actual multiplication is done by calling the function handle returned from the Compile method.

whizzter · 3h ago

I think what GP was referring to is that there is nothing stopping the code from being designed so that:

    AST<float> p1 = exp.GetP1();
    AST<float> rsqr = p1 * p1; // AST<float> implements an operator* overload that produces an AST<float> object

Even if many frown upon operator overloading due to the annoying magical-ness of the standard library's appropriation of the shift operators for "piping" (<< and >>), it's still what makes many people prefer C++ for vector math and similar tasks.

So whilst the result isn't a direct multiplication it should still be an acceptable use since the resulting code will actually be doing multiplications.

Considering what goes on under the hood, however, I guess there might be some compiler optimization reasons to keep the expression object in the example as the single holder of data, instead of spreading it out across an allocation tree with lots of pointer-chasing.
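
To make the idea concrete, here is a minimal sketch in Python (not C++, and not NativeJIT's API) of operators that build tree nodes instead of computing values. The `Node`, `Param`, `Mul` and `Add` classes and the `evaluate` helper are all made up for illustration. `evaluate` is only a stand-in for the compile-then-call step.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    """Base class: arithmetic operators build tree nodes instead of computing."""

    def __mul__(self, other):
        return Mul(self, other)

    def __add__(self, other):
        return Add(self, other)

@dataclass(frozen=True)
class Param(Node):
    index: int

@dataclass(frozen=True)
class Mul(Node):
    left: Node
    right: Node

@dataclass(frozen=True)
class Add(Node):
    left: Node
    right: Node

def evaluate(node, args):
    """Stand-in for compiling and calling: interpret the tree directly."""
    if isinstance(node, Param):
        return args[node.index]
    if isinstance(node, Mul):
        return evaluate(node.left, args) * evaluate(node.right, args)
    if isinstance(node, Add):
        return evaluate(node.left, args) + evaluate(node.right, args)
    raise TypeError(f"unsupported node: {node!r}")

p1 = Param(0)
rsqr = p1 * p1               # builds Mul(Param(0), Param(0)); nothing is multiplied yet
print(rsqr)
print(evaluate(rsqr, [3.0]))  # the multiplication happens here
```

Note that each operator allocates a separate node object, which is exactly the pointer-heavy layout the last paragraph is wary of. A builder that owns all nodes in one container would avoid that, at the cost of the operator syntax.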

plq · 1h ago

> So whilst the result isn't a direct multiplication it should still be an acceptable use since the resulting code will actually be doing multiplications.

First, nope, if it's not multiplication it should not be using the * operator, period. Operator overloading is already overused and it leads to so much problematic code that looks fine to the untrained eye (string concat using operator+ being a famous example).

That said, you may also want to pass more options to the Mul node in the future and operator*() can only accept two arguments.

As another example, run the following Python code to see how Python represents its own AST:

    import ast;print(ast.dump(ast.parse("2*PI*r*r"),indent=2))

OskarS · 5h ago

Yes, but what I suspect the commenter was saying is that you can build the expression using operator overloading as well, so you can type "a + b", not "a.Add(b)".

I love it when libraries like this do that. z3 in python is similar, you just build your constraints using normal syntax and it all just works. Great use of operator overloading.

BearOso · 56m ago

Except that's not what's happening. expression.Mul isn't multiplying itself against something, it's adding a Mul instruction to its list. Maybe it would have been more obvious if the method name was insertMul instead.

kookamamie · 4h ago

Yes, exactly. See Eigen as an example.

anon-3988 · 10h ago

Interesting, this is very similar to llvmlite.Builder which is a wrapper over llvm. I am probably going to create something similar for my Python -> C -> assembly JIT.

lhames · 6h ago

The LLVM ORC and Clang-REPL projects would be worth checking out if you haven't already: there's a healthy community of high performance computing folks working in this space over at https://compiler-research.org.

In particular, this talk might be interesting:

"Unlocking the Power of C++ as a Service: Uniting Python's Usability with C++'s Performance"

Video: https://www.youtube.com/watch?v=rdfBnGjyFrc Slides: https://llvm.org/devmtg/2023-10/slides/techtalks/Vassilev-Un...

Twirrim · 9h ago

There's also libgccjit, https://gcc.gnu.org/wiki/JIT, though all of the third party language bindings appear to be stale for it.

globalnode · 9h ago

that project sounds interesting as well, but what do you do with libraries in python.. have the generated C code translate back to python calls?

anon-3988 · 9h ago

The point is not to compile entire Python programs; the point is to optimize the specific parts of Python that matter. To illustrate, consider summing the integers from 0 to N-1 in Python:

    def sum(N):
        x = 0
        for i in range(N):
            x += i
        return x

There's absolutely zero reason why this code has to involve pushing and popping stuff on the Python virtual stack. This should be compiled into assembly with a small conversion between C/PyObject.

The goal is to get to a point where we can even do non-trivial things inside this optimized context.
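
The stack traffic is easy to see by listing the bytecode CPython runs for such a loop. A minimal sketch using the standard `dis` module; the function is renamed `sum_below` here (my choice, to avoid shadowing the built-in `sum`), and the exact opcodes differ between CPython versions.

```python
import dis

def sum_below(n):
    x = 0
    for i in range(n):
        x += i
    return x

# Each instruction below pushes to or pops from the interpreter's value stack.
instructions = list(dis.get_instructions(sum_below))
for ins in instructions:
    print(ins.opname, ins.argrepr)

print(len(instructions), "bytecode instructions for a five-line function")
```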

Python will never be able to go down to assembly because Python supports doing "weird shit" like dynamically creating modules, hell, even creating a Python file, running eval on that, and loading it as a new module. How are you even going to transpile that to assembly?

So I approach the problem the same way numba does, but hopefully in a more modern and simpler way (implementation-wise). I'm planning on doing it in Rust, and the backend should be agnostic (GCC, Clang, whatever C compiler there is).

hayley-patton · 2h ago

> "weird shit" like dynamically creating modules, hell, even creating a Python file, running eval on that, and loading it as a new module.

Expect that you don't, and deoptimise when you do: https://bibliography.selflanguage.org/_static/dynamic-deopti...

It's really not that impossible.

izabera · 6h ago

this looks convenient to use from c++, but the example code it generates is rather suboptimal (see https://godbolt.org/z/3rWceeYoW in which no normal compiler would set up and tear down a stack frame for that) so i'm guessing there isn't any support for optimisations? what's the advantage of this over just compiling + calling dlopen/LoadLibrary on the result?

whizzter · 2h ago

For simple functions a C compiler will generate code that is perhaps 50% faster than this standard prologue/epilogue (modern CPUs eat up most of the "bloat", whereas the branch to _any_ function will cause some branch predictor pressure). As soon as the function grows, the gains will be smaller, as long as the code runs somewhat in a straight line and isn't subject to cache misses.

Compared to even an optimized interpreter this will be somewhere between 4x and 20x faster (mainly due to having far, far smaller branch predictor costs), so even if it doesn't generate optimal code it will still be within an order of magnitude of optimal native code, whereas an interpreter will be much further behind.

dlopen/LoadLibrary etc. come with far more memory pressure and OS bookkeeping.

rurban · 4h ago

I guess not for the first function call, but for subsequent calls, yes. They claim that register optimizations are properly done.

nurettin · 8h ago

It really sounds like a job for Java (Oracle, I know, I know.)

whizzter · 2h ago

Having written small compilers and other code generators targeting both the JVM and .NET runtimes, I can say that the .NET equivalents have some extra simple options for scenarios like this.

Both have always supported building libraries/assemblies and loading them (the ASM library+custom classloaders for Java and AssemblyBuilder in .NET are about equally capable).

However .NET also has DynamicMethod that is specifically built to quickly build just small functions that aren't tied to larger contexts (similar API to asm/assemblybuilder).

But an even easier option for stuff exactly like in the article, one that people don't widely seem to know about, is that Linq (yes, that "sql-like" stuff) actually contains parts for deferred method building that can be easily leveraged to quickly produce customized native code methods.

The neat part about the Linq code generator is that you can just drop in plain C# code blocks to be translated by the C# compiler into Linq snippets, and then with some helpers transform everything into Linq trees that can then be compiled.

The benefit over Asm/AssemblyBuilder/DynamicMethod is that Linq nodes are basically a built-in AST that can be directly leveraged, whereas the other APIs require some mapping of your own ASTs to the respective stack machines.

https://asm.ow2.io/

https://learn.microsoft.com/en-us/dotnet/api/system.reflecti...

https://learn.microsoft.com/en-us/dotnet/api/system.linq.exp...

pjmlp · 2h ago

Usual remark regarding the age of bytecode systems and JIT (aka dynamic compilation), predating Java.

adwn · 7h ago

> It really sounds like a job for Java

Why?

dontlaugh · 4h ago

Because the JVM's JIT does already specialise based on runtime values.

nurettin · 2h ago

Hotspot (TM) JIT compiles Java code to machine code when it detects hot code paths, speeding up execution during runtime, exactly the use case described in the article.

adwn · 37m ago

Doesn't Hotspot have notoriously long warm-up times? Have those been exaggerated, or have they recently improved?

If you know beforehand that you'll execute some piece of code many times, the most efficient approach is to JIT-compile it right away, and not only after a lot of time has passed.
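
The difference can be sketched with Python's built-in `compile()`, as an analogy only: it produces bytecode, not machine code. `compile_query` and `TieredQuery` are hypothetical names, and the emptied `__builtins__` is a convenience, not a security sandbox.

```python
def compile_query(expression):
    """Eager: compile a runtime-supplied expression once, up front."""
    code = compile(expression, "<query>", "eval")

    def run(**variables):
        return eval(code, {"__builtins__": {}}, variables)

    return run

class TieredQuery:
    """Lazy: re-parse the source on every call until a call-count threshold."""

    def __init__(self, expression, threshold=1000):
        self.expression = expression
        self.threshold = threshold
        self.calls = 0
        self.compiled = None

    def __call__(self, **variables):
        self.calls += 1
        if self.compiled is None and self.calls >= self.threshold:
            self.compiled = compile_query(self.expression)  # "hot" now
        if self.compiled is not None:
            return self.compiled(**variables)
        # Cold path: eval() parses and compiles the string again each time.
        return eval(self.expression, {"__builtins__": {}}, variables)

eager = compile_query("3 * r * r")   # compile cost paid before the first call
tiered = TieredQuery("3 * r * r", threshold=3)

print(eager(r=2))
print([tiered(r=2) for _ in range(3)], tiered.compiled is not None)
```

If the expression is known to run many times, the eager version never spends calls on the cold path, which is the case being argued for above.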

NativeJIT: A C++ expression –> x64 JIT (2018)

Comments (29)