Show HN: Tattoy – a text-based terminal compositor (tattoy.sh)
56 points by tombh 2h ago 13 comments
Show HN:I made a word translation plugin for language learning.
2 points by Mantaa 1d ago 0 comments
Gmail to SQLite
338 tehlike 114 5/10/2025, 4:25:43 AM github.com ↗
For example, you have recipients, subject, and sender as JSON fields, when you could have just a headers field with all of them, and even add the rest of the headers in the message.
If it's performance related, you can still have headers as a single json blob and then use generated columns for the specific fields.
For example
I've found this model really powerful, as it allows users to just alter table to add indexed generated columns as they need for their specific queries. For example, if I wanted to query dkim status, it's as simple as or whatever you want.I find it useful to create indexes like this, then create VIEWs using these expressions instead of ALTER'ing the main table with generated columns.
In general I prefer break out columns that I expect to have/use consistently, especially for something as stable as email headers. Maybe schema changes are a bit easier with a headers column, but imo its just trading the pain on write for pain on read (while leaving the door open to stuff failing silently).
I've found it really helpful to avoid the growing pains that come with "just shove it all in mongo", or "just put it on the file system", but not much cost.
https://github.com/terhechte/postsack
Is there a size option too? To see which senders are using most of my storage.
(Also your website's SSL certificate has expired.)
Funnily enough, the gmvault.org domain _that_ page points to is simply a parked GoDaddy placeholder. It's also not been updated in 10+ years except for two non-source files.
On the other hand, qdirstat "cache" files are really easy to generate so can be used for visualizing a bunch of file-like things
Especially as I receive more and more information that my freelance e-mail is put into spam by recipient systems.
Not sure how to get rid of my Google ecosystem routines, though. Feels daunting.
I set up Postfix to catch *@immibis.com. I use it for some less important things - first mailing list subscriptions, then I even used it to buy festival tickets. These are lower-risk things. If they don't work then it's not a big deal. In the latter case I'd be out $200 and not be able to go to that festival (which did actually happen, but not because of my email server, but because they tried to invent a hidden fee after I already paid, and I'll have to go to small claims court to get a refund). Now that I know it works, I use it by default for new less-important account signups. (And nobody's questioned me yet why the local-part of my email address is the name of their business)
I still wouldn't use *@immibis.com for my bank account. I'd use gmail for that. The bank is a corporation. If there's a problem between them and my email server, they'll tell me to suck it up, then delete my money. If there's a problem between them and Google's server they'll be forced to fix it. If there's a problem with my Google account, I can go to the bank office and say "Google banned me from Google, so I need to link to a different account" and they'll have a procedure for that. They won't have a procedure for "your mail server sends LF when it should be CRLF" or whatever weird issue could occur between them and a self-hosted mail server. But if my bank account was the last thing remaining on Google, in practice, it would still be a successful email de-googling. 99% is a pretty good success rate. The bank app runs on Android, anyway. Could switch banks and only do banking in person.
I find Youtube a good source of entertaining and informative content (certainly way better than something like Instagram) and I haven't replaced that yet.
After Mozilla jumped the shark and declared they hate privacy, I've been gradually moving things over to Zen Browser, which is based on Firefox. (I don't care that Zen isn't significantly more private than Firefox; I care there's someone in between me and Mozilla and that isn't Google)
you dont get it done by moping about it, but by doing
OAuth really doesn't. Every OAuth integration I've ever built always feels like it needs a tiny bit of custom development.
Also the OAuth flow is usually absolutely horrible for when you're trying to get a token for accessing your own data. I've had to spin up a temporary web app to handle a hunch of redirects just to get my own token!
Yes, you can do that, however the problem is getting a client_id/client_secret in the first place. You need to register yourself for one, you need to (nowadays) whitelist every single account or go through a google verification process. At one point you could apply for a client_id that allowed anyone to use it, but that ship has sailed.
Just to make sure the differences are clear: with username and password and IMAP I can use an RFC standardized protocol to sign into an inbox and I do not need Google's permission. The oauth flow they have is neither standardized (XOAUTH2 is not a standard as far as I know at least), requires provider specific logic (Outlook is different to Google) and most importantly requires me to get Google's permission to sign in. I need to get a client_id with the necessary scope, and that is only granted after a review by Google. [1]
[1]: asterisk is that a development only app can authenticate up to 100 users, and those users need to be explicitly whitelisted in the dev panel.
With an app password you have full IMAP access.
App passwords no longer exist on Google.
Here's the support article we link our users to:
https://support.google.com/accounts/answer/185833
Workspace is a bit different, however. You need an admin to enable app passwords.
Gmail to SQLite describes 6 steps to get credentials working, but it is not true for me. After 6 steps:
- that Google said that my app was not published, so I published it
- Google said that app cannot be internal, because I am not a workspace user
- for external apps
- then it said I cannot use the app until it is verified
- in verification they wanted to know domain, address, other details
- they wanted to have my justification for scopes
- they wanted to have video explaining how the app is going to be used
- they will take some time to verify the data I provided them
It all looks like a maze of settings, where requiring any of users to go above the hoops required by Google is simply too much.
Links:
[0] https://github.com/rumca-js/Django-link-archive
Does anybody have insight as to why it’s so bad?
When those inevitably get used for nefarious purposes; Google image suffers as a result.
Don't jump through their hoops.
- Open source
- Resume (so backups/restores will eventually complete)
Honorable mention: https://www.mailstore.com/en/products/mailstore-home/
- Not open source
- GUI with index: nice for searching mail locally
- Resume only for backup (so large restores generally fail)
https://github.com/gaubert/gmvault
For a long time and it's worked great. But it seems like GYB is actively maintained, so maybe I should switch.
(N.B. it can also process the mbox produced by https://takeout.google.com/ with the caveat that in some case Takeout permanently losses some information because of a bug with older encodings so I'd always prefer a backup using gmvault or imap)
I have a reminder to trigger this every few months and update my local backup. If I recall it comes as a gzipped mbox file.
IMAP is much harder, and much slower, and is bound by Google's bandwidth limits.
Maybe there have been times when it was broken or under high demand though?
So I can understand why using Google's proprietary API might work better (or not, I don't know)
Anyway, as a sibling says, nowadays Google Takeout includes mbox and work properly (and is pretty fast, like half a day), but doesn't allow continuous update.
And I migrated to another mail provider (infomaniak), and I've thanked myself for using my own mail domain name years earlier.
I installed a third-party client (Thunderbird, but I imagine any would work) on a local box, signed in with both emails, and just copied the mail over from one to the other. Low-tech, but it worked quite well. I may have forced some local cache/download for the original email, but I can't recall. I'll check later if it preserves headers and the like. I assume it would, but it wasn't that important to me.
I actually thought about writing at some point about the process of getting off gmail and all the funny things I ran across.
https://imapsync.lamiral.info/
https://marcoapp.io
The schema from AOX always looked really good to me, but I never have gotten to really giving it a try. I wanted to use it, primarily, to get analytics about my mail and for search (not a daily-driver IMAP server).
https://www.manitou-mail.org/
It is easy to fix though, since I can get Google Take Out (is that the name?) which I think is free and then parse file files once downloaded.
Still using this tool would be faster from a get it going perspective.
It is now syncing my messages, but very slowly. Some Async magic could probably be cool :)
Feature request: parse email content to extract unsubscribe links and allow me to unsubscribe from most frequent senders easily
I admittedly might just not have or understand the use case nor have I thought about how large a Gmail account actually is so feel free to ignore if I'm missing something!
- Searching a plain text data file is O(n). Searching a SQLite database that has been properly indexed, which is very easy to do nowadays with FTS5, is O(log n) worst case scenario and O(1) in the best case. This doesn't explain why SQLite over a dataframe or anything, but it definitely justifies it over plain text for large email collections.
- SQLite is really easy to write custom views and programs around. Virtually every major programming language can work with it without issue. See also: simonw's wonderful https://datasette.io/ .
- SQLite is an accepted archival format by the Library of Congress, if you ever want to go down the rabbit hole of digital preservation.
I think it's a big eml file.
To me having to install a tool through Python is a show-stopper.
[0] https://github.com/djipko/gbackup-rs
I have that tool activating once every 24h still, to this day.
* Google collects vast amounts of personal data, specifically through receiving all of your email and analyizing it.
* It builds elaborate user profiles and uses them to target you with ads designed to better influence you.
* Its hold on information (from different sources) has made it excessively powerful economically, and thus also politically.
* Google/Alphabet has long started to affect legislation, including through direct registered lobbying: ~15 Million USD in 2024 (opensecrets.org).
* It has been known to pass, and likely still passes, the information it collects - including copies of your email correspondence - on to the US government (Edward Snowden leaks).
and finally:
* There are multiple email providers, many of them quite good - both for pay and gratis. Naturally most of the gratis ones have their own interests, but nothing like Google.
Edit: You can create groups of people that are not affected by Google/Apple/Facebook, this should be seen as a goal.
Anyway, among the popular providers, I am partial to ProtonMail:
https://en.wikipedia.org/wiki/Proton_Mail
it's Swiss, run by a non-profit, and originally crowd-funded. Over 100 Million users, IIANM. I am not a cyber-security expert, so I can't claim to have audited them for security or privacy bona fides (but maybe somebody has?).
I've also heard suggestions to try Tuta (tuta-nota), but have never tried it myself.
There are also many smaller providers. Specifically, many Internet access providers also provider email services. Not that they are super trustworthy, but - there's a good chance you're not just hading everything over to one of the behemoths like Google or Microsoft.
If you do end up on one of the bigger providers - it's probably best to be on one that's not linked to the government where you live. So if you're in a NATO country you could go for yandex or mail.ru and if in Russia then maybe GMX?
Unfortunately - wherever you take your email - when you write a GMail address, Alphabet has a hold of your correspondence again. So, we need to convince our friends to ditch Google as well.
A problem with desktop clients is that they are usually generic IMAP clients, meaning they have limited support for server-side search, non-standard features like labels, and creating server-side filters.
$4/m seems nominal for a 50GB mailbox with no weird adtech shit built in.
Gmail != Google
WorkMail != Amazon
Gmail is targeted at consumers and is engineered to suck up your data to pay for itself. WorkMail is targeted at businesses and is engineered to not piss off IT administrators and middle management.
I could also pay for a Google Workspace and stay with Gmail.
Also, hosting email under your own domain gives you the freedom to move from one email provider to another even if they do shut down.
I put my money where my mouth is. I wanted to degoogle and so pay $50/year for Fastmail. One feature I like is automatically snoozing certain emails. Most of my non-personal email is automatically snoozed until 6pm every day. This way I don't get multiple notifications throughout the day for emails that aren't time sensitive.
No comments yet