(except I have it aliased to "jq-structure" locally of course. also, if there's a new fancy way to do this, I'm all ears; I've been using this alias for like... almost a decade now :/)
In the spirit of trying out jqfmt, let's see how it formats that one-liner...
Not bad! Shame that jqfmt doesn't output a newline at the end, though. The errant `%` is zsh's partial line marker. Also, `-ob -ar -op pipe` seems like a pretty good set of defaults to me - I would prefer that over it (seemingly?) not doing anything with no flags. (At least for this sample snippet.)
naniwaduni · 3h ago
For small problem sizes, you can get a nontrivial improvement by moving the unique up ahead of all the string manipulation:
jq -r '[path(..)|map(if type=="number" then "[]" end)]|unique[]|join(".")/".[]"|"."+join("[]")'
For larger problem sizes, you might enjoy this approach to avoid generating the array of all paths as an intermediate, instead producing a deduped shadow structure as you go along:
jq -rn --stream 'reduce (inputs|select(.[1])[0]|map(if type=="number" then "[]" end)) as $_ (.; setpath($_; 1))|path(..)|join(".")/".[]"|"."+join("[]")'
(Note that in either case, you still run yourself into a bit of trouble with fields named "[]", as well as field names with "." in them. I assume this is not a serious issue, since you're only ever looking at this interactively.)
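For comparison, here is a minimal Python sketch of the same idea as both one-liners: walk the document, replace every array index with [], and dedupe the resulting paths. The name structure_paths is made up for illustration, and the result is sorted as plain strings, so its order may differ from what jq's unique produces. It has the same blind spot for keys containing "." or "[]".

```python
def structure_paths(doc):
    """Return the sorted, deduplicated jq-style paths found in doc.

    Every array index is collapsed to "[]", so all elements of an
    array contribute to the same set of paths.
    """
    seen = set()

    def visit(node, path):
        seen.add(path)
        # The root is written "."; below it, keys are appended as ".key".
        prefix = "" if path == "." else path
        if isinstance(node, dict):
            for key, value in node.items():
                visit(value, f"{prefix}.{key}")
        elif isinstance(node, list):
            for item in node:
                visit(item, f"{path}[]")

    visit(doc, ".")
    return sorted(seen)


example = {"items": [{"id": 1, "tags": ["a", "b"]}, {"id": 2}]}
for line in structure_paths(example):
    print(line)
```

Because the set is filled while walking, no intermediate list of all paths is built, which is closer in spirit to the --stream variant than to the first one.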
crashabr · 4h ago
I'm a long time user of this snippet as well. I discovered fastgron [0] last year and found it convenient for some situations!
Not anywhere near as sophisticated as yours but I have something vaguely similar for simplifying JSON documents (while maintaining what the data also looks like) for feeding to LLMs to help them code against:
jq 'walk(if type == "array" then (if length > 0 then [.[0]] else . end) else . end)'
So that 70,000+ line Amazon example of yours would boil down to:
.. which is easier/cheaper to feed to an LLM for getting it to write code to process, etc. than the multi-megabyte original.
rwiggins · 5h ago
Oh wow, that's fantastic. I love that it includes real values while still summarizing the doc's structure. I'm going to steal that. I'll probably keep jq-structure around because it's so easy to copy/paste paths I'm looking for, but yours is definitely better for understanding what the JSON doc actually contains.
naniwaduni · 3h ago
Got a bit nerd-sniped here, but first of all we can reduce if A then B else . end === if A then B end since jq 1.7:
jq 'walk(if type == "array" then (if length > 0 then [.[0]] end) end)'
Now we could contract those conditionals:
jq 'walk(if type == "array" and length > 0 then [.[0]] end)'
but it turns out we can even more usefully express if length > 0 then [.[0]] end === [limit(1; .[])] == .[:1]:
jq 'walk(if type == "array" then .[:1] end)'
From here, we can golf it a little further (this is kind of a generic type-matching pattern):
jq 'walk(arrays[:1] // .)'
although this does incur a bit more overhead than checking type directly.
Speaking of overhead, though, it turns out that the implementation of walk/1 (https://github.com/jqlang/jq/blob/master/src/builtin.jq#L212) will actually run the filter on every element of an array, even though we're about to throw most of them out, which we can eliminate by writing the recursion explicitly:
jq 'def w: if type=="array" then [limit(1; .[]|w)] elif type=="object" then .[] |= w end; w'
which gets the operation down from ~200 ms on my machine (not long enough to really get distracted, but enough to feel the wait) to a perceptually instant ~40 ms (which is mostly just the cost of reading the input). Now we can golf it down a little more:
jq 'def w: if type=="array" then [limit(1; .[]|w)] else objects[] |= w end; w'
jq 'def w: (arrays[:1]|map(w)) // (objects[] |= w); w'
(the precedence actually allows us to eliminate the parens here...)
jq 'def w: arrays |= .[:1]|iterables[] |= w; w'
And, inaccessibility of the syntax aside, I think this does an incredible job of expressing the essence of what we're trying to do: we trim any array down to its first element, and then recursively apply the same transformation throughout the structure. jq is a very expressive language, it just looks like line noise...
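The same essence, spelled out in Python as a rough sketch (trim_arrays is a hypothetical name, not anything from jq): keep only the first element of each array, and recurse only into what was kept, so the discarded elements are never visited.

```python
def trim_arrays(node):
    """Return a copy of node in which every array keeps at most its first element."""
    if isinstance(node, list):
        # Slice before recursing, so elements we throw away are never processed.
        return [trim_arrays(item) for item in node[:1]]
    if isinstance(node, dict):
        return {key: trim_arrays(value) for key, value in node.items()}
    # Scalars (strings, numbers, booleans, null) pass through unchanged.
    return node


sample = {"a": [{"b": [1, 2, 3]}, {"c": 4}], "d": []}
print(trim_arrays(sample))
```

Slicing first mirrors the point about walk/1 above: the saving comes from not running the transformation on elements that are about to be dropped.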
Bluestein · 2h ago
Hat off.-
PS. Also, if I may, thanks for the walkthrough - I'd be clapping with just the short form at the end, but the reasoning is appreciated.-
jzelinskie · 6h ago
This is an incredibly useful one-liner. Thank you for sharing!
I'm a big fan of jq, having written my own jq wrapper that supports multiple formats (github.com/jzelinskie/faq), but these days I find myself more quickly reaching for Python when I get any amount of complexity. Being able to use uv scripts in Python has considerably lowered the bar for me to use it for scripting.
Where are you drawing the line?
rwiggins · 5h ago
Hmm. I stick to jq for basically any JSON -> JSON transformation or summarization (field extraction, renaming, etc.). Perhaps I should switch to scripts more. uv is... such a game changer for Python, I don't think I've internalized it yet!
But as an example of about where I'd stop using jq/shell scripting and switch to an actual program... we have a service that has task queues. The number of queues for an endpoint is variable, but enumerable via `GET /queues` (I'm simplifying here of course), which returns e.g. `[0, 1, 2]`. There was a bug where certain tasks would get stuck in a non-terminal state, blocking one of those queues. So, I wanted a simple little snippet to find, for each queue, (1) which task is currently executing and (2) how many tasks are enqueued. It ended up vaguely looking like:
I think this is roughly where I'd start to consider "hmm, maybe a proper script would do this better". I bet the equivalent Python is much easier to read and probably not much longer.
Although, I think this example demonstrates how I typically use jq, which is like a little multitool. I don't usually write really complicated jq.
dotancohen · 3h ago
I could Google it, but tell me a bit more about uv scripts. Isn't uv a package manager like pip?
easton · 1h ago
uv has a feature where you can put a magic comment at the top of a script and it will pull all the dependencies into its central store when you do “uv run …”. And then it makes a special venv too I think? That part’s cloudier.
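For reference, the "magic comment" is the inline script metadata block standardized in PEP 723. A minimal sketch, assuming a hypothetical script that needs requests; the file name keys.py, the helper names, and the URL in the usage note are all made up for illustration:

```python
# /// script
# requires-python = ">=3.9"
# dependencies = ["requests"]
# ///
"""Hypothetical example: print the top-level keys of a JSON document fetched over HTTP."""
import sys


def top_level_keys(doc):
    """Return the sorted keys of a JSON object, or an empty list for any other value."""
    return sorted(doc) if isinstance(doc, dict) else []


def main(url):
    # Imported here so the helper above stays usable without the dependency installed.
    import requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    print("\n".join(top_level_keys(response.json())))


# Only touch the network when a URL is actually passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Running `uv run keys.py https://example.com/data.json` should then resolve requests into an isolated, cached environment before executing the script, with no manual pip install or venv setup.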
May I also add this ain't a mere one liner. It's a masterclass!
jdc0589 · 5h ago
this is a super useful oneliner, immediately saved to my bash profile as `jqstructure`
Hendrikto · 6h ago
> Side note: Ever tried Googling for "jq formatter"? Reading search results is a nightmare since jq itself is, among other things, a formatter.
That’s what I thought too, when I read the title. To clarify: This tool formats jq commands, not JSON itself.
vanschelven · 6h ago
Which makes sense because jq, with no options, acts as a formatter by default. (it's about 50% of my jq usage).
layer8 · 6h ago
While it doesn’t help much for search in this case, the more specific term is “pretty-printer”.
s17n · 5h ago
If you need to format your one-liner, maybe it shouldn't be a one liner?
Anyway, whether or not this tool is advisable, it's definitely cool, nice work!
noperator · 3h ago
My prototype one-liners usually turn into Go programs :)
Bluestein · 3h ago
Sic semper :)
Bluestein · 4h ago
> If you need to format your one-liner, maybe it shouldn't be a one liner?
Entirely correct, this point.-
PS. May I also appreciate your comment, as far as form? You made both, valid, points.-
kiitos · 5h ago
Instead of making users enable every formatting rule explicitly e.g.
jqfmt -ob -ar -op pipe
It would be better if the tool enabled a common set of rules by default, so that `echo ... | jqfmt` actually did something useful :)
xmonkee · 5h ago
God I really abhor jq and it seems it's becoming a standard. I dislike it cause I'm too dumb to correctly dredge up its incantations, and once a year I have to go reading their arcane docs. I suppose it's another fertile ground for LLM use.
mdaniel · 5h ago
The bad news is that much like how "I'm just going to DSL this ..." inevitably morphs into a full-blown programming language[1], so too is the ubiquitous "gah, your language is too complex, I'm going to just use this other tool that implements my favorite 10% of the cases"
which is a long way of saying: or else what? There's 100% no way that I'm going to ever, ever use <<python3 -c "import json, sys; print(json.load(sys.stdin)[...ohgawd...]">> and if you are, then more power to ya and jq apparently doesn't solve a problem you have
It's a pretty good on/off-ramp into better tools. Going from arbitrary slop to something that's a reasonable input to `nixlang` or Dhall is pure win IMHO.
I get a lot of use out of `jq` even though I prefer sounder systems than JSON.
pxc · 4h ago
What would "non-arcane" jq docs look like? I'm kind of in the same boat, being an infrequent jq user, but I've generally found the docs pretty easy to navigate.
ashwinsundar · 4h ago
A standard for what? It just makes JSON look nicer and more query-able. You don't have to use it.
xmonkee · 3h ago
A standard as in there is a cottage industry of tools and websites built around it now, like this one.
lxgr · 1h ago
Given the choice between a hypothetical standard that nobody wrote (or implemented) and a tool that organically grew complex enough to benefit from a standard, I'd rather have the latter.
Users (i.e. not implementors) usually also don't read the standard – they read the docs (ideally containing lots of examples on top of a dry enumeration of options), or today indeed ask an LLM.
quotemstr · 1h ago
Hey. Don't hate on jq too much. It's a backdoor way to get functional programming past people's mental perceived complexity forcefields.
mikeocool · 4h ago
Been using fx (fx.wtf) as an alternative to jq recently.
Gives you a nice JavaScript interface to do similar types of processing to what I would do with jq.
jq is sed for JSON data.
gofmt is a Go source formatter.
jqfmt is like gofmt, a Go source formatter, but for JSON.
So jqfmt is really a JSON beautifier...
Anyone with an ASR-33 for sale? rq?
quotemstr · 1h ago
jq is convenient, but I don't see the draw in building data processing pipelines on it. It's like writing complex software in shell.
Recently, I found myself wanting to do a join by filename on two sets of about 300,000 files. Tried bashing my head against jq with INDEX and various tricks and couldn't get the runtime below minutes.
Then I just gave up, fired up Python, loaded the dataset into Pandas, and did a join. Completed too fast to notice.
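Part of why the Python route finishes so quickly is that a join on a key costs one hash lookup per record. The sketch below is a dependency-free stand-in for the pandas join described above, not the commenter's actual code; the record layout and the "filename" key are assumptions:

```python
def join_by_filename(left, right):
    """Inner-join two lists of dict records on their "filename" key."""
    # Index the right side once, so each left record needs a single lookup.
    index = {}
    for record in right:
        index.setdefault(record["filename"], []).append(record)

    joined = []
    for record in left:
        for match in index.get(record["filename"], []):
            # On conflicting keys, values from the left record win.
            joined.append({**match, **record})
    return joined


left = [{"filename": "a.txt", "size": 1}, {"filename": "b.txt", "size": 2}]
right = [{"filename": "a.txt", "sha": "x"}, {"filename": "c.txt", "sha": "y"}]
print(join_by_filename(left, right))
```

The work grows with the combined size of the two inputs plus the number of matches, which is why a few hundred thousand rows per side is a small job.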
I'll use this opportunity to plug the one-liner I use all the time, which summarizes the "structure" of a doc in a jq-able way: https://github.com/stedolan/jq/issues/243#issuecomment-48470... (I didn't write it, I'm just a happy user)
For example:
[0] https://github.com/adamritter/fastgron
which ends up producing output like (assuming queue 0 was blocked)
https://docs.astral.sh/uv/guides/scripts/
Makes it a snap to have a one file python script without having to explicitly pip install requests or whatever into a venv.
1: https://www.laws-of-software.com/laws/zawinski/
PS. Honestly, it's pretty close.-
https://github.com/noperator/sol
I actually wrote jqfmt because I needed it for sol :)
PS. Happily, featured here:
- https://news.ycombinator.com/item?id=41556088