Rybbit: Open source Google Analytics replacement

362 samdung 164 5/7/2025, 5:45:33 PM github.com ↗

Comments (164)

nm980 · 31d ago
The market for Google Analytics alternatives is crowded. There's Plausible, Ahrefs web analytics, onedollarstats.com, PostHog, Matomo, Umami, Grafana, Microsoft Clarity (free at any scale), and so many others. Despite minor differences these products all compete for the same users (e.g. if someone is a PostHog customer they probably won't be using Ahref web analytics) yet most of these companies offer generous free tiers while rybbit only offers a free trial.

How do products like rybbit.io stay competitive without a similar free tier or major differentiation? Is rybbit generating revenue for its hosted plan?

openplatypus · 31d ago
As a founder in this space, it's not as bad as you think. There are niches in this crowded yet broad space.

Plausible - good for self-hosting, but their SaaS is very expensive and FOSS vs SaaS offering differ.

Ahrefs - they will use your traffic to improve your competitor research, you really should use them cautiously.

Matomo - feature rich but can be overwhelming.

Posthog - its SaaS is US based so dismissed early by EU customers.

Clarity, like GA has serious privacy issues.

Our product, Wide Angle Analytics, has its own gotchas compared to competitors - it's opinionated and there are folks who do not agree with our opinions, but the landscape of websites is so vast that you find your client nevertheless.

That said, we are still in business after 4 years, and we saw a few competitors disappear or get acquired and extinguished.

So, all the best to the OP. Hope you find your niche :)

rmonvfer · 31d ago
Just for the record, I use PostHog in my Europe-based startup. They have an EU region so it’s not a problem for us.
teddyh · 31d ago
There are some indications that this may soon no longer be sufficient.
openplatypus · 31d ago
It is a US company. It does not matter where the server resides. If the server is in the EU but under US jurisdiction (US company), that can be (needs checking) treated as an International Data Transfer.
meander_water · 31d ago
There's a bunch more listed here as well https://github.com/oxnr/awesome-analytics
nm980 · 31d ago
What's your sales strategy? Is cold calling companies with google analytics installed on their websites more effective than the blog? Have you been able to retain Next.js users after Vercel released Web Analytics?

gregjw · 31d ago
Surprised no one's talked about Fathom Analytics. My alternative of choice.
herpdyderp · 31d ago
Why do you prefer Fathom?
flashblaze · 31d ago
I'm using Clarity and was under the impression it was better than GA
rossjudson · 31d ago
What are the privacy issues with GA and Clarity?
CGMthrowaway · 31d ago
What about Amplitude?
bill_yang · 31d ago
This is pretty spot on. There's a couple of dimensions the major players sit on, and there's enough combinations that there's plenty of space for smaller players to survive in.

I'm not super familiar with all of these products, so some of these ratings will be based on vibes

1-----------------10

OSS <-> Proprietary

Small business <-> Enterprise

Simplicity <-> Complexity

Web analytics <-> Product analytics

Privacy <-> No privacy

# Rybbit (me) - just launched $0

OSS/Proprietary - 2

I use AGPL 3.0 which isn't as permissive as MIT

Small business/Enterprise - 5

I definitely want enterprises to use Rybbit, but it's hard to target them at this stage

Simplicity/Complexity - 6.5

I think Rybbit is going to end up as one of the more feature-rich OS analytics tools, but I hope it stays easy to use (famous last words)

Web analytics/Product analytics - 4

Want to target both eventually, but my product analytics is weaker relatively

Privacy/No privacy - 3

Can be as GDPR compliant as others, but can also be configured to be a bit more invasive

# Posthog - ~15M ARR

OSS/Proprietary - 4

Have a bunch of enterprise licensed parts of their repo and they tell people in their docs to not self-host it because it's too difficult.

Small business/Enterprise - 8

Seems like they hook startups in with generous free tiers and then milk the unicorns that come out

Simplicity/Complexity - 10

The scope of Posthog is awe inspiring. They are literally 10 startups in 1

Web analytics/Product analytics - 8

I believe product analytics was their first feature

Privacy/No privacy - 7

I think they use cookies?

# Google Analytics

OSS/Proprietary - 10

Small business/Enterprise - 9

Free for everyone but it's clear they don't care about regular users that want to track their small site

Simplicity/Complexity - 8

If there was a dimension for usability it would be 11/10 totally unusable

Web analytics/Product analytics - 6

Not too sure about this one

Privacy/No privacy - 9

i mean it's google

# Mixpanel - $200m ARR

I'm the least familiar with this one

OSS/Proprietary - 9

Small business/Enterprise - 8

Simplicity/Complexity - 8

Web analytics/Product analytics - 9

Privacy/No privacy - 7

# Umami - unknown ARR (maybe 500K?)

OSS/Proprietary - 1

MIT license, no enterprise only features from what I see

Small business/Enterprise - 5

Seem to have some big names on their site

Simplicity/Complexity - 4

Web analytics/Product analytics - 5

Privacy/No privacy - 5

They claim GDPR compliance but I've self hosted it and they clearly fingerprint users without any obvious opt out.

# Plausible - ~2m ARR

OSS/Proprietary - 4

AGPL v3 and some enterprise features the community version doesn't have. Also they use Elixir so i doubt anyone actually reads it /s

Small business/Enterprise - 6

Have to be selling to enterprises with that ARR

Simplicity/Complexity - 3

Tool is very simple at the surface, but there's a lot of config options under the hood

Web analytics/Product analytics - 3

Mostly just web analytics

Privacy/No privacy - 2

This is a big focus for them

# Simple Analytics - ~500k ARR

OSS/Proprietary - 8

Closed source, but they are an open startup that shares their financials

Small business/Enterprise - 3

They show some big names, but the creator is an indie hacker

Simplicity/Complexity - 2

Self explanatory

Web analytics/Product analytics - 2

Privacy/No privacy - 2

Very GDPR compliance focused

If this was a multi-dimensional vector, I'm trying to fill the space between something like Posthog and Plausible, where we are as open source as either of them and fill the missing space between extreme simplicity and extreme complexity.
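To make the "multi-dimensional vector" framing concrete, here is a rough sketch that treats the vibes-based ratings above as points in a five-dimensional space and ranks the other tools by Euclidean distance from Rybbit. The numbers are copied from this comment; weighting every dimension equally and using Euclidean distance are assumptions made only for illustration.

```python
import math

# Ratings copied from the comment above, in this order:
# OSS<->Proprietary, Small business<->Enterprise, Simplicity<->Complexity,
# Web analytics<->Product analytics, Privacy<->No privacy
RATINGS = {
    "Rybbit": (2, 5, 6.5, 4, 3),
    "Posthog": (4, 8, 10, 8, 7),
    "Google Analytics": (10, 9, 8, 6, 9),
    "Mixpanel": (9, 8, 8, 9, 7),
    "Umami": (1, 5, 4, 5, 5),
    "Plausible": (4, 6, 3, 3, 2),
    "Simple Analytics": (8, 3, 2, 2, 2),
}


def distance(a, b):
    """Euclidean distance between two products' rating vectors."""
    return math.dist(RATINGS[a], RATINGS[b])


def neighbours(name):
    """Other products sorted from closest to farthest from `name`."""
    others = [p for p in RATINGS if p != name]
    return sorted(others, key=lambda p: distance(name, p))


if __name__ == "__main__":
    for product in neighbours("Rybbit"):
        print(f"{product:18s} {distance('Rybbit', product):.2f}")
```

Under these assumptions the closest neighbours to Rybbit are the simpler open source tools rather than Posthog, so the result is quite sensitive to how the dimensions are weighted.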

GordonS · 31d ago
This really does look like a great project!

Is it possible to use it server-side only, with no JavaScript required? I currently use Umami like that - it has an API, so I can send it page view events and custom events from server-side code. That means analytics can't be disabled by uBlock or the like, or by disabling JavaScript.

withinboredom · 31d ago
I also have my own analytics service on serverside, and it is sooo vastly different from the client-side analytics. The client side only sees ~5-10% of what I see on the server side -- even after filtering out bots and the like.
bill_yang · 31d ago
I'm going to add an API soon!
stuartjohnson12 · 31d ago
> Posthog - its SaaS is US based so dismissed early by EU customers.

Posthog has had an EU server for years. I'm not sure what you mean by this.

withinboredom · 31d ago
An EU server isn't enough. Those EU servers should be operated and maintained by an EU subsidiary that licenses the tech from the US company. In other words, even if the US company wanted the data served by the EU company, they couldn't get to it.
openplatypus · 31d ago
It is US company. It does not matter where the servers are physically located.
_heimdall · 31d ago
It would be quite funny to reply to an NSA request by saying "Oh yeah, I have that data and can access it but it's on a server in the EU. I can get it, but I won't."

I genuinely don't know how they would proceed, but it'd be interesting to watch.

guappa · 31d ago
Probably randomly find 3kg of cocaine on your person.
jillyboel · 31d ago
Some not-so-friendly men in suits will show up at your house. Afterwards you will most certainly comply.
tonyhart7 · 31d ago
I mean, if you can even refuse that. This is a national agency we're talking about

They can physically tap global internet cable just because they can

tonyhart7 · 31d ago
ok, if that's possible, what stops an EU agency from doing the same to an EU-based company then???
danielheath · 31d ago
EU doesn’t have that kind of agency.

Member states have spy agencies, but they also signed treaties to join the EU. Having your spy agency violate international treaties isn’t something most governments allow.

blitzar · 31d ago
If the company or individual is in a country, expect they can be compelled to hand over everything in their possession by a court order (or it can be seized).

If the information is stored in a country, expect that the owner of the information can be compelled to hand it over by a court order (or it can be seized).

__m · 31d ago
The EU doesn't have something like the CLOUD Act, so they wouldn't be able to do that.
_heimdall · 31d ago
Well the EU absolutely could do the same thing. If an EU-based company had servers in the US, I would expect the EU could compel them to hand over data despite where the data is stored.
bjelkeman-again · 31d ago
The EU doesn’t do law enforcement. That is a national concern. But a legal request from one country to another may well happen.
_heimdall · 31d ago
I misspoke there, I was meaning Europe and shouldn't have put EU.

Agreed, the union doesn't have any enforcement mechanism I'm aware of that would fit, but any country in Europe could do a similar thing to companies based in their borders.

tom_y · 31d ago
Niche markets always allow good products to survive.
bill_yang · 31d ago
Builder of rybbit here - I will probably add a free tier in the following weeks. I didn't because I was scared of being overloaded by an influx of free users, but that doesn't scare me anymore.

I started working on this 4 months ago and only publicly launched a few days ago.

As for monetization, I have no idea yet. I'm happy to collect stars for the time being. What do you think I should do?

nm980 · 31d ago
Not sure, but I'm definitely interested in following your business and seeing what your strategy will become because I was building something similar but when larger teams started releasing free solutions I couldn't think of a way to compete. Best of luck.
thwarted · 31d ago
> if someone is a PostHog customer they probably won't be using Ahref web analytics

It's (un)surprisingly common to end up with multiple website analytics products on the same site; marketing wants these two, another department wants another. When I had ghostery show the list of things it was blocking I often saw multiple, overlapping-feature-set analytics integrations being blocked on the same site.

openplatypus · 31d ago
Yes, I have seen organizations with websites that had 15+ trackers because every person in the company had their favourite tool.
xyzzy_plugh · 31d ago
I've seen companies from the inside where those 15+ trackers were all added by one person in the span of a week.

I've also seen those trackers be added by someone who exits the organization a month later, thereby blessing the trackers with a protection spell making their removal unlikely for fear of breaking some metric pipeline somewhere.

crowcroft · 31d ago
You would be surprised how many companies do in fact use multiple analytics services.

Only one tool will be a 'source of truth' but a company using a combination of something like GA4 (Business source of truth), Mixpanel (product insights), and Clarity (Landing page analysis) is not unheard of.

The types of companies that use multiple services are also the types of companies that are likely spending $1,000s per month as well, so overall quite a profitable industry for many companies to operate in.

XCSme · 27d ago
I can only speak for myself (I made UXWizz): I feel like interest has always been there for different platforms, most are still too simple or too complex and hard to set up and use.

In my case, I am simply focused on the self-hosting niche, trying to make the best self-hosting experience. I have an advantage here, because most other tools earn their money from their cloud version, so they don't really want you to self-host, thus usually provide different "open-source" versions and rarely provide support for it.

Also, because "cloud-focused" analytics are built for scale, they are actually not optimal for tracking smaller amounts of traffic (most websites don't have millions of visitors per month), so they use more resources for running scale-proof stacks.

have_orange · 25d ago
I'm working on releasing a product and I am trying to figure out between: self-hosted and SAAS model. My question is: I want to release it as self-hosted, but I think the risk of giving the customer the source code is too high... they can just release it as open-source or sell it at a lower price, so my business is dead? how do you do it? Is it worth trying to obfuscate code or compile it as a binary so they cannot access the code?

Also, with it being self-hosted, how are you charging a monthly fee? If you are charging a monthly fee, can't the customer just remove the product licence validity check? e.g. they remove verification that they have purchased a licence?

Any insight highly appreciated.

XCSme · 24d ago
You can't beat piracy. Look at video-games. Just make it easier to install and set-up when purchasing than when pirating. Also, the automatic updater only works with a valid license key and support is also only provided with a valid key.

I don't charge monthly, I charge for having an active support/updates period, eg: https://license-api.uxwizz.com/support/prices

have_orange · 23d ago
OK, firstly thank you so much for the reply! I have also sent you a DM on X to try and get your input, but you already replied here so cheers!

So I was focused on the wrong thing… focusing on removing the possibility of fraud/theft/piracy etc… but what you are saying is focus on charging for product updates, i.e. new releases which require a valid key + direct support. Thanks for helping me get this straight in my head.

I feel that selling as a self-hosted model will allow me to operate as a solo founder without the overhead of legal complications with GDPR/Security certifications/ owning the liability of having my customers data… if it is self-hosted they can do what the fuck they want with their own customer data and I don’t need to own that liability… this was my thinking… but then obviously I made the error of focusing too much on trying to avoid theft. Thanks again. I have followed you on X, would be great to connect there. Wishing you lots of continued success! Appreciate your time and advice. Matt

xyst · 31d ago
Posthog is pretty good but very pushy towards using their SaaS (understandably). Self hosting is not really advertised on their main site however is buried in their gh repo as a footnote [1] with indications of vague issues past 100K events/month. Haven’t delved into how to scale it past that though and they do provide some docs that I have yet to review.

Also the primary repo is not FOSS, and that "100% FOSS" repo is buried in yet another footnote [2].

Plausible follows in PH footsteps but is not fully faithful to open source. If you want to self host, you won’t have same set of features as their SaaS and need to rely on long term releases for their "community edition" [3]

On "Ahrefs", is there even an open source version of their product? I couldn’t easily find it (on mobile). [4]

Maybe I’ll take a look at others you mentioned later but if rybbit can remain faithful to their FOSS roots then I think there’s a real chance of it becoming huge.

For those that don’t want to self host (mostly corporate shitholes), rybbit can milk them with their managed SaaS product.

[1] https://github.com/PostHog/posthog?tab=readme-ov-file#self-h...

[2] https://github.com/PostHog/posthog?tab=readme-ov-file#open-s...

[3] https://github.com/plausible/analytics?tab=readme-ov-file#ca...

[4] https://ahrefs.com/

bill_yang · 31d ago
I think Posthog is incredible, and there's no way I (it's just been me building rybbit for the past few months) will be able to compete with them on their full scope of features for the foreseeable future.

I tried to self host Posthog for my other project as it far exceeded even the generous free tier. I have a Hetzner bare metal server with 64gb of ram https://www.hetzner.com/dedicated-rootserver/ax42/ and it was running all 16 cores at 100% and didn't end up working. So I think Posthog's stack is just way too heavy to self host effectively, and it's just not in the same category as Plausible, Umami, or Rybbit.

I'm trying to build the best OSS analytics out there - and even though it's super crowded, most non-trivial websites run one so there is space for everyone to survive in.

nm980 · 31d ago
> "Self hosting is not really advertised on their main site"

How would rybbit.io make money if they are only better at self hosting? Wouldn't the users they are targeting only self host anyways?

> "On "Ahrefs", is there even an open source version of their product? I couldn’t easily find it (on mobile)."

Not all of these companies are open source but they are still competitors because they have generous free tiers so the cost of self hosting an alternative wouldn't be justified.

XCSme · 27d ago
Yeah, this is why I think cloud-based with free self-hosted version doesn't work, because they are basically competing with themselves if the self-hosted version works too well.
neves · 31d ago
Are these open source and locally hosted? Or you must share your data with a big corporation to use them?
nm980 · 31d ago
PostHog and Plausible are both open source and not backed by big corporations but if sharing data to third parties and being open source is a concern (which seems to be the selling point rybbit.io is targeting) I would expect users to self host instead of paying for a hosted plan anyways?
pc86 · 31d ago
Is sharing your data with a startup or small company any better than sharing it with a big corporation?
haswell · 31d ago
Potentially yes, but depends very much on the privacy policy and data handling promises being made.

I think the instinct to distrust big companies is at least partly because many of them have already proven not to be good stewards of data which when combined with their scale has more worrisome implications.

With a smaller/newer player, at least there’s some hope that they’re not capable of the same harms at a smaller scale, and in some cases may market themselves specifically as a more private alternative.

Whether or not this turns out to be true in practice and over the long run is another thing.

betterThanTexas · 31d ago
Hope ain't the same thing as trust, though. A small player would need to make a pretty significant effort to suggest they wouldn't abuse your usage-patterns.
j16sdiz · 31d ago
Startups have a high chance of being acquired. The new owner will get all your data, whether you trust them or not.
haswell · 31d ago
Yeah, that’s the big issue and what I was alluding to in the last paragraph.

It’d be nice to have products that aren’t created with the aim of being acquired, and/or companies that remain committed to their original mission.

dec0dedab0de · 31d ago
It's open source and locally hosted, you don't have to share your data with anyone.
betterThanTexas · 31d ago
> Or you must share your data with a big corporation to use them?

I'm choking on the irony

dec0dedab0de · 31d ago
It's open source, why would you also need a free tier for hosting?
nm980 · 31d ago
Self hosting would cost more money than free tier for most companies.

> more than 90% of companies use PostHog for free.

https://posthog.com/pricing

Onavo · 31d ago
Which ones are embeddable? Often I need to embed some analytic charts for users in my app e.g. blog views and very few support easy embedding+authentication.
XCSme · 27d ago
I am actually implementing the widgets/embedding feature in UXWizz.

What type of authentication would you need? A simple token in the embedding URL? Or you want a way in which you can publicly share the URL, but require a password/auth?

luckylion · 31d ago
Grafana isn't a Google Analytics alternative. You can build a lot of what you need with it (I've done that), but you still need to manage the actual Analytics part separately, Grafana only gives you the visualization.

It's okay, but I probably wouldn't choose it again. The ease of setting up Dashboards and Panels is great at first, but you pay for it with a low ceiling of what you can do (without building around it) and a "we trust everyone" approach to security.

betterThanTexas · 31d ago
> actual Analytics

I've never used google analytics before. What's the marginal value over statsd?

luckylion · 30d ago
Primarily the ease of use. You add the JS to your site and don't worry about anything else, collection is very fast (delivering data is < 30ms) and has edge-servers around the globe.
tonyhart7 · 31d ago
it's crowded but only for web, I'm still searching for one for desktop and mobile (posthog is still the best imo)
steviedotboston · 31d ago
Clarity is more of a Hotjar competitor, right?
XCSme · 27d ago
Yeah, but free as far as I know.

You usually have:

All-in-one analytics: Posthog, Matomo, UXWizz (I made it!)

Simple analytics: Plausible, Umami

Qualitative analytics (heatmaps/recordings): Hotjar, Clarity, FullStory, Mouseflow, LuckyOrange, etc.

I think Clarity also added more analytics features, but you're still sending all your data to Microsoft.

nm980 · 31d ago
It also tracks page views, referrers, geographic location, and other analytics common to rybbit
jhpacker · 31d ago
Yes, it's very much like HotJar, focused on session capture & heatmap.
indiantinker · 31d ago
Umami works for me. I just want that dopamine kick that someone clicked on my page so I don't feel lonely on the internet.
threatripper · 31d ago
Would you be interested in a service that occasionally reads your page and sends you thoughtful comments?
bitbasher · 31d ago
It was only a bot, but if it makes you feel better... :)
1dom · 31d ago
My experience is that out of my other logs/metrics (cloudflare & server logs) Umami was the only one that didn't overinflate by counting bots and crawlers.

I know the true state of my site visitors: the vast majority of legit visitor traffic is from my own home IP, and my own mobile IP. Umami was the only one to show that.

Apreche · 31d ago
For me, the best Google analytics replacement has been nothing. Just don’t do analytics at all. Your web site will still work without it. In fact, it will work better!
crazygringo · 31d ago
> Your web site will still work without it. In fact, it will work better!

It objectively won't.

Analytics tell you where your website isn't working, so you can fix it. Buttons you thought were obvious that users are blind to. Pages where nobody scrolls because they didn't realize there was more content. Figuring out where users get stuck because they don't understand the navigation you designed. Etc etc etc.

If you have a hobby website, then sure maybe analytics don't matter. But the idea that sites work better without analytics makes as much sense as saying you'll see better when you wear dark sunglasses.

mindcrash · 31d ago
Once upon a time we did analytics and error analysis by running shell scripts executing awk, sed and grep over an apache or nginx access log or error log.

What I am trying to say is that you can still do analytics, even pretty advanced stuff with some more elaborate scripting, if you want. The only thing you need is the access log.

Something which has been largely forgotten ever since tools like Urchin became a thing :)
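As a minimal sketch of the access-log approach described above, written in Python rather than awk/sed/grep: the regular expression assumes the common Apache/nginx "combined" log format, and `top_pages` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the leading fields of a "combined" format log line (assumed format).
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def top_pages(lines, limit=10):
    """Count successful GET requests per path, most requested first."""
    hits = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m["method"] == "GET" and m["status"].startswith("2"):
            hits[m["path"]] += 1
    return hits.most_common(limit)


# Invented sample lines using documentation-range IP addresses.
sample = [
    '203.0.113.5 - - [07/May/2025:17:45:33 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [07/May/2025:17:46:01 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [07/May/2025:17:46:10 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.0"',
    '203.0.113.7 - - [07/May/2025:17:47:00 +0000] "POST /login HTTP/1.1" 302 0 "-" "Mozilla/5.0"',
]
print(top_pages(sample))  # [('/', 2), ('/about', 1)]
```

Real logs would also need bot filtering and handling of malformed lines, which this sketch skips.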

ordersofmag · 31d ago
Except if any of your pages are cached between eyeball and your server and so your server logs don't capture everything that is going on. You can get fancy with web server logs, but depending on what you're trying to understand it may not be the data you need.

<source: did fancy things with logs over the last 25 years, including running multiple tools on the same site in parallel to do comparisons (Analog, AWStats, Urchin, GA, Omniture, homegrown, etc...)>

codingdave · 31d ago
If you control the cache layer, log it there. If you don't control the cache layer, does a read from the end user cache really count as a separate visit anyway?
ordersofmag · 31d ago
There are plenty of situations where someone visiting a page once and someone repeatedly looking at that page over a period of days (even if it is pulled from their browser cache) is an important difference. Obviously it depends on what you're using the data to try to understand.
hinkley · 31d ago
This is how you end up with no-cache assets on pages so they can keep track of actual traffic.
pc86 · 31d ago
One of the greatest jobs I ever had from a technical perspective had terabytes of structured access logs hosted on prem inside of a VPN, with a few small bespoke tools to search through them (and many more pages of commands for common tasks not yet implemented in a UI).

Not a single line of tracking or analytics on the front end, we just tracked everything we cared about at the server level.

closewith · 31d ago
And most likely a compliance and legal nightmare waiting to drop on a DPO one day.
pc86 · 31d ago
That place didn't have any European operations so no GDPR concerns¹, but for what it's worth it was completely... pseudonymous I think is the term we want? You couldn't link a server entry to an actual user account by any means² but you could group distinct server calls together as coming from the same person. These weren't "server logs" in the sense of IPs or user agents or that kind of thing. More like application logs w/ scrubbed/obfuscated user data just stored in gigantic text files.

¹ To those who would say it doesn't matter, I'd say that laws aren't laws if they can't be enforced and there's no enforcement mechanism for some EU bureaucrat to fine a company with no operations outside of the US.

² I'm sure the technical means existed to do it especially if you already had access to the logs but the point is we weren't explicitly storing any PII or data that was linked to a real account. Just actions throughout the apps.

closewith · 31d ago
However, if you do this, you will still need to comply with all relevant privacy laws.

For example, in the EU, you need user consent to use server logs that include IP addresses for analytics. You also need to provide post-consent opt-outs and privacy statements and audit logs and all of a sudden you're building another analytics tool.

cortesoft · 31d ago
How exactly does that work? You need consent for server logs? Am I able to run fail2ban without consent?
closewith · 31d ago
In the EU, IP addresses are personal data and you need a legal basis for each form of processing. You could make an argument that Fail2Ban falls under legitimate interest, but there is now precedent that analytics must have user consent and another legal basis will not be accepted.
cccbbbaaa · 31d ago
No, logs don't require consent in that case, see recital 49.
cptskippy · 31d ago
> Urchin

Urchin was acquired by Google and was ultimately sunset in favor of Google Analytics. It supported local and hybrid analytics models, the latter arguably evolved into Google Analytics.

paxys · 31d ago
Such a product will work fantastic until you get your first user.
dylan604 · 31d ago
That's just not realistic though. People with marketing departments need analytics. Otherwise, they atrophy and reveal to everyone they are not as necessary as led to believe. People without marketing departments probably never look at the logs like you.
jsheard · 31d ago
True, but for personal/hobby sites you probably are just better off just not knowing. Nothing good comes of tying your self-worth to how much attention you think you're getting.
sneak · 31d ago
There is nothing to suggest that people who want to measure (and perhaps increase) their publishing reach are “tying [their] self-worth to how much attention [they] think [they’re] getting”.

This is sort of like assuming everyone who is taking photos at a tourist attraction is doing so to show off their holiday for social status.

If your site or content is truly valuable, it is a public good to monitor, analyze, and improve upon its reach and usability.

cortesoft · 31d ago
I think most people are talking about for business websites
crazygringo · 31d ago
Why would you jump to the conclusion that any of it is about "self-worth"?

Maybe you're writing for an audience and you want to see what resonates most with them.

Sometimes popularity is a good thing to measure, not for your ego, but by how much you are helping others.

It is sad when people assume metrics are about vanity, rather than about how much we're helping others.

closewith · 31d ago
> Otherwise, they atrophy and reveal to everyone they are not as necessary as led to believe.

In my experience, when analytics and the related ads tracking tools break, Marketing departments are revealed to be much more important than generally believed in the business.

SchemaLoad · 31d ago
Product people need analytics too. You need to know how many people use each feature to make informed decisions on what needs to be invested in, what should be cut, etc.
cortesoft · 31d ago
I can't imagine someone trying to run a web business with no analytics.
vivzkestrel · 31d ago
this is some kinda joke right? analytics are necessary for 10000 reasons
eGP9jDq_nw · 31d ago
.. may we see them?
XCSme · 27d ago
Tracking conversions, errors, page load speeds, bot traffic, traffic sources, marketing campaigns, top pages, etc.

It helps run your business and make sure everything works as expected, and that you don't waste money on ads/marketing/seo/traffic, etc.

dhosek · 31d ago
There were a gajillion of these things before Google Analytics. Probably the best options were those that relied on log analysis rather than having a JavaScript bug on every page.
AndrewStephens · 31d ago
The documentation states that rybbit does not use cookies and is compliant with the GDPR. The first part is true but, looking at the code (very nice to have it available), the tracking is done by IP address, trading one piece of tracking data for another.

I realize that this is probably the only way it could work but it is not clear to me that tracking by IP address (even over a single session and shredding the data once a day) is any better from a GDPR standpoint.

KronisLV · 31d ago
It doesn't have that much in the way of fancy UI, but I found that Matomo allows you to both choose whether to use cookies / IP or maybe to cut off parts of the IP as well: https://matomo.org/faq/general/configure-privacy-settings-in...

People seem to occasionally post cool new solutions, though it doesn't seem like Matomo has gotten that much attention, despite being a pretty strong alternative to Google Analytics (I haven't had that many issues while self-hosting it either).
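To illustrate what "cutting off parts of the IP" can look like in practice, here is a small sketch using Python's standard `ipaddress` module. This is not Matomo's implementation; the function name `mask_ip` and the default prefix lengths are assumptions chosen for the example.

```python
import ipaddress


def mask_ip(ip, v4_prefix=16, v6_prefix=48):
    """Zero out the host part of an address, keeping only the network prefix."""
    addr = ipaddress.ip_address(ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


print(mask_ip("203.0.113.77"))         # keeps the first two octets
print(mask_ip("2001:db8:abcd:12::1"))  # keeps the first 48 bits
```

The more bits are dropped, the coarser any geolocation or unique-visitor estimate derived from the address becomes.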

AJMaxwell · 31d ago
I have been using Matomo along side GA4 for a month now. The amount of useful data coming from Matomo, even anonymized, is more expansive and easier to access than GA4. Plus self-hosting was pretty easy and it keeps the data on our servers, which just feels right.
9283409232 · 31d ago
I deal with GDPR daily and the truth is that GDPR enforcement doesn't understand what is acceptable from a GDPR standpoint and that is likely why they are in the process of revamping it. You can also anonymize data and that is no longer considered personal data under GDPR so it is possible to hash an IP address and that be acceptable.
Fraaaank · 31d ago
> You can also anonymize data and that is no longer considered personal data under GDPR so it is possible to hash an IP address and that be acceptable.

That's not completely true. Recital 26 of GDPR describes anonymous information as

> “information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.”

Hashing does not meet this threshold. If the same IP address is hashed using the same method, the result will always be the same, meaning it can be matched. Hashing is therefore considered pseudonymization and under GDPR, pseudonymized data is still considered personal data.

Moreover, the act of anonymization itself is a form of processing and therefore falls under the scope of GDPR. So even attempting to anonymize personal data doesn't remove GDPR obligations for the anonymization itself.

robbie-c · 31d ago
Disclaimer: IANAL

> If the same IP address is hashed using the same method, the result will always be the same, meaning it can be matched.

The way people get around this is by using an ephemeral salt, that is deleted e.g. daily. After enough time has passed, it'd be impossible to reverse the hash as the salt would be lost.
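A rough sketch of the ephemeral-salt idea in Python, to make the mechanism concrete. This is not Plausible's or Rybbit's actual code: the class name `DailySaltHasher`, the choice of HMAC-SHA256, and the inputs (site, IP, user agent) are all assumptions. It only illustrates the mechanics and says nothing about whether the result satisfies GDPR or the ePD.

```python
import hashlib
import hmac
import secrets
from datetime import date


class DailySaltHasher:
    """Derives visitor IDs with a random salt that is replaced once per day."""

    def __init__(self):
        self._salt_day = None
        self._salt = b""

    def _current_salt(self, today: date) -> bytes:
        # On the first call of a new day, overwrite the old salt with a
        # fresh random one; the previous salt is not kept anywhere.
        if today != self._salt_day:
            self._salt = secrets.token_bytes(32)
            self._salt_day = today
        return self._salt

    def visitor_id(self, ip: str, user_agent: str, site: str, today: date) -> str:
        material = "|".join((site, ip, user_agent)).encode("utf-8")
        return hmac.new(self._current_salt(today), material, hashlib.sha256).hexdigest()
```

Within one day the same visitor maps to the same ID, so uniques can be counted; after rotation the same visitor gets an unrelated ID, which is also why multi-day retention reports stop working. The salt here lives only in process memory; a multi-process deployment would need shared storage for it, and that storage would have to honor the deletion as well.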

rustc · 31d ago
Plausible uses the same algorithm and they have a page written by a lawyer claiming this is GDPR compliant: https://plausible.io/blog/legal-assessment-gdpr-eprivacy

Edit: Found more discussion here: https://github.com/plausible/analytics/discussions/1963#disc...

> To summarize, I believe the EDPB has made their position very clear on this in their 2023 guidelines: Plausible's fingerprinting is subject to Article 5(3) of the ePD. Plausible has made their position very clear in their blog post, leaning in the other direction. Until this is tried out in court, I don't believe that there will be any definitive answer.

jhpacker · 31d ago
Unlike Plausible and Fathom, it looks like Rybbit is NOT salting by default (but it's an option to enable per site: https://www.rybbit.io/docs/enhanced-privacy), which is why they can offer retention reporting.

This seems incompatible with ePD.

dkga · 31d ago
So IP is considered personal information?
cccbbbaaa · 31d ago
Yes, that is what case C-582/14 concluded.
keerthiko · 31d ago
If the IP address is hashed somehow it would no longer be personally identifying while still being unique enough for analytics purposes, correct?

Does geographic grouping data depend on the IP address? If so I suppose it would need to be extracted first before hashing the IP, and I wonder how much that weakens the anonymization.

kevin_thibedeau · 31d ago
You can hash every IPv4 address for a rainbow table. Needs some salt.
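To illustrate why an unsalted hash of an IP is reversible in practice, here is a small Python sketch. Strictly speaking it builds a plain lookup table rather than a chain-based rainbow table, and it only covers a /16 so that it runs quickly; the function names and the choice of SHA-256 are assumptions. The same loop over all 2**32 IPv4 addresses is well within reach of commodity hardware.

```python
import hashlib
import ipaddress


def unsalted_hash(ip: str) -> str:
    return hashlib.sha256(ip.encode("ascii")).hexdigest()


def build_lookup(network: str) -> dict:
    """Precompute hash -> IP for every address in the given network."""
    net = ipaddress.ip_network(network)
    return {unsalted_hash(str(addr)): str(addr) for addr in net}


# A /16 (65,536 addresses) stands in for the full IPv4 space here
lookup = build_lookup("198.51.0.0/16")

# Given only a stored hash, the original address falls out of the table
stored = unsalted_hash("198.51.100.23")
recovered = lookup.get(stored)
```

A secret, sufficiently long salt defeats the precomputed table, but only for as long as the salt itself stays secret or has been destroyed.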
SquareWheel · 31d ago
According to the author, Rybbit hashes IPs with a daily rotating salt.

https://www.reddit.com/r/selfhosted/comments/1kgytl4/i_built...

dylan604 · 31d ago
Okay, but that doesn't mean the concept is bad.
lmkg · 31d ago
Yes it does.

If a user can say "here's my IP address, what data do you have on me?" and you can answer that question, then that's personal data under GDPR. It's pseudonymized, but not anonymized, and pseudonymous data is personal data.

wizzwizz4 · 31d ago
Even if you can't answer that question, if it can be answered, that's still personal data.
dylan604 · 31d ago
What's the minimum size of an operation before the GDPR kicks in? In other words, are all sites governed by GDPR, or are some companies considered too small to be under the GDPR regulations? I know that there are some regulations that give smaller outfits a pass. I know nothing about GDPR as a European audience is not my target and I'm not kowtowing to them.
lmkg · 31d ago
GDPR does not currently have explicit business size thresholds. Its provisions are all framed as personal rights of the data subject, so its provisions are always in effect. By contrast, CCPA in California is framed as a consumer protection law so it only applies to companies of a certain size.

In practice, small fries are not an enforcement priority. Regulators in most countries are not well-funded so they have to be frugal with their enforcement actions.

The EU is currently reviewing an option to relax GDPR requirements for smaller businesses. Not remove GDPR requirements, just streamline some of the process overhead.

wqtz · 31d ago
The jury is out on IP addresses vs GDPR. A hashed IP address is not anonymous, nor is last-digit anonymization anonymous.

So, let's not bother with it. I can say all IP addresses are located on Earth and someone would be offended because now we are invading their privacy by knowing which planet they are from. GDPR is not clear on IP address or IP address derived metadata. There is no case law for it, nor acceptable methodology, and everyone is speculating about what the consequences are, and it is mostly just opinions from IANALs. GDPR is astrology for non-enterprise companies.

cccbbbaaa · 31d ago
> GDPR is not clear on IP address or IP address derived metadata. There is no case law for it,

There is, see C-582/14, which concludes that IP addresses, even dynamic ones, are personal data.

autoexec · 31d ago
If people insist on tracking users with analytics, the least folks can do is use something other than google to do it.
nadermx · 31d ago
If you don't want to roll your own and don't care if it's open source, I've used clicky.com for years. Simple, and shows everything I need. As others have said, it's a crowded market. Still cool though that people are launching these projects.
slig · 31d ago
+1. And it's dirt cheap. I pay about $200/m for 800k daily events.
ray023 · 31d ago
Well, obvious question: How does it compare to Plausible and all the other open source analytics?
colesantiago · 31d ago
Plausible gets needlessly expensive as one grows and it essentially punishes you for growing.

And some features aren't available 1:1 with the CE version of Plausible either.

bill_yang · 31d ago
Yea, funnels are not open source for Plausible
bill_yang · 31d ago
Check out our demo at https://demo.rybbit.io/1. We have a lot more features than Plausible, but they're still presented in a way that is intuitive to use. You shouldn't need to read pages and pages of documentation to be able to set up funnels on rybbit, for example.
marvinblum · 31d ago
I keep recommending it, but you can check out https://european-alternatives.eu/category/web-analytics-serv... for a more complete list of EU based web analytic services.

I'm one of the co-founders of Pirsch, and a bit worried because the space is getting really crowded :D

vanschelven · 31d ago
I love it, too bad the author of that project seems to have become overwhelmed with the requests for updates/additions.
kull · 31d ago
Why not Matomo?
tacker2000 · 31d ago
Upvote for matomo!

This project here looks interesting, but is quite new. Let's see how it evolves in the future.

ordersofmag · 31d ago
Matomo is an evolution of Piwik which was first released in 2007. So not 'quite new'.
tacker2000 · 31d ago
I'm talking about the project OP posted, not Matomo.
bill_yang · 31d ago
Hey I built this! I was meaning to launch Rybbit on show HN tomorrow morning but I guess you beat me to it haha.
dotandgtfo · 31d ago
Plausible, Fathom, Umami and the others all do cookie-less tracking too. Why don't you add an option to track through cookies? Most serious businesses have the consent for putting analytics cookies there, especially if it's a first-party cookie. This will differentiate you and instantly make this a more serious option for self-hosters who want simple but reliable tracking. Especially if you can set this as a serverside first-party HTTPS cookie.

Alternatively, add an identify() call and let others roll their own solution for this.

Then I would actually trust your retention numbers.

Great stuff though! Impressive launch.

BaudouinVH · 31d ago
XCSme · 27d ago
I didn't make it on that list :(
ceving · 31d ago
codazoda · 31d ago
Because I like minimalist tools, onedollarstats.com looks interesting to me. I can’t find much info about their privacy posture (which prevents me from using Google Analytics). I use my own counter, but it’s got very limited features.
VladVladikoff · 31d ago
Is there any server side only analytics software that is open source and decent? I really don’t want to add any more JavaScript to my pages that I don’t need.
CliffyA · 31d ago
I have some static websites on S3 and CloudFront. I use https://goaccess.io/ to parse the logs and generate a html report.

As mentioned elsewhere in the thread, there is a lot of bot activity there, which using JS might clean up a bit.

If you are interested, I have a write up of my setup here, with the report generation down at the bottom: https://gamestrut.com/Blog/2022-06-08-static-site-hosting-on...
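For a feel of what log-based analytics involves, here is a minimal Python sketch that summarizes access logs without any client-side JavaScript. It is not how GoAccess works internally. It assumes the common "combined" log format (CloudFront's own log format is different and would need its own parser), and the `BOT_HINTS` substrings are a deliberately crude, hypothetical bot filter.

```python
import re
from collections import Counter

# Combined log format: ip ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)
BOT_HINTS = ("bot", "crawler", "spider")


def summarize(lines):
    """Return (pageviews per path, approximate unique visitor count)."""
    pageviews = Counter()
    visitors = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        if any(hint in m["agent"].lower() for hint in BOT_HINTS):
            continue  # drop self-identified bots
        if m["method"] != "GET" or m["status"] != "200":
            continue
        pageviews[m["path"]] += 1
        # IP plus user agent is only a rough stand-in for a unique visitor
        visitors.add((m["ip"], m["agent"]))
    return pageviews, len(visitors)
```

Bots that send browser-like user agents pass straight through a filter like this, which is one reason log-based counts tend to run higher than JS-based ones.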

GordonS · 31d ago
Umami has an API, so it can be used server-side, without any JavaScript (I use it like that).
ksec · 31d ago
GoAccess?
cyberax · 31d ago
Is there anything that can work with the request logs instead of the usual transparent pixels and/or script inclusions?
bill_yang · 31d ago
I'm going to develop server-side SDKs in the future. But many existing platforms like posthog already support this, though I don't know if they support processing raw request logs literally.
cyberax · 31d ago
It'd be great. Posthog is a bit too heavy for just that.
ksec · 31d ago
This seems very similar to Umami. Is this a fork of it? TypeScript / Next.js and a similar design?
bill_yang · 31d ago
I wrote this fully from scratch. Similar stack to Umami though.
ksec · 31d ago
Good luck. It is looking the best out of all the alternatives so far.
nh2 · 31d ago
Probably one of the coolest logos I've seen so far. How did you come up with it?
Sephr · 31d ago
Can the creator /user/bill_yang explain this contradiction in the readme? It claims to support tracking unique users, but then immediately afterwards claims not to track users. Do you mean that you don't do any user tracking by default?

> Key Features

> - All key web analytics metrics including sessions, unique users, pageviews, bounce rate, session duration

> - No cookies or user tracking - GDPR & CCPA compliant

internetter · 31d ago
the industry has decided that counting unique users can be done without it being tracking. you can decide differently. Oftentimes how they do it is some combination of

- Hashing IPs with other data to form a non-reversible UUID

- Not tracking across multiple host domains

- Not setting cookies or storing any information in the browser

newusertoday · 31d ago
very nice demo. I saw that you are using threejs but when I checked the network logs it's not downloading it, which is great. Are you doing SSR?
bill_yang · 31d ago
Thank you. I am using https://globe.gl/ which wraps three.js. The realtime page is still pretty slow to load, though.

I'm using Next.js but I'm using all client-side components. The tooling around SPA client side state is just really good so I don't see a huge reason to go full SSR, especially when SEO doesn't matter for the actual app.

karolist · 31d ago
I'm hosting my blog on Cloudflare Pages, its analytics show 80 or so uniques every day consistently even though I barely write there. Installed Umami - 0 visitors. None. Internet is just LLM crawlers hungry for content now?
lmkg · 31d ago
We passed the tipping point where bot traffic outnumbered human traffic fifteen years ago. LLMs are an order of magnitude worse by most first-hand accounts, but it's just a continuation of a very long trend.
tonyhart7 · 31d ago
"Internet is just LLM crawlers hungry for content now?"

it's been that way for a few years, real users are on mobile apps and social media now

the percentage of internet users who are "surfing" the web is dwindling and will likely shrink further in the near future

sltr · 31d ago
I see this too on my CF Pages-hosted blog.

Analytics only work if the agent runs JS. CF on the other hand counts file fetches, which can't be circumvented.

There's always a baseline of bot traffic.

karolist · 31d ago
ah, that explains it, I think. I expected them to sessionize the file transfers under one unique somehow still, even without JS.
miragecraft · 31d ago
A couple of days ago I was researching website analytics and GDPR/cookie law, and it seems clear that you need user consent even if IP addresses are only processed or temporarily stored before being discarded.

Arguing otherwise is like claiming it’s legal to steal from a store as long as you return the goods the next day - it’s legal fantasy.

I don’t think the EU is eager to go after these “ethical” analytics companies or their users, since they have bigger fish to fry. But if you think you’re legally in the clear using these solutions without user consent, you’re fooling yourself.

XCSme · 27d ago
The law will change soon as far as I know, but still, the best way to respect data privacy laws is to not send your data to other companies AND to avoid tracking personal and sensitive data as much as possible. If you self-host and don't share the tracked data, you are already doing better than 99% of the companies
dns_snek · 31d ago
> it seems clear that

Can you elaborate?

miragecraft · 31d ago
The logic is simple: as soon as you collect and/or process IP addresses, you need user consent, as they are personal data.

You don’t get to “undo” this requirement by discarding the IP address afterwards; the law doesn’t care.

Others have come to the same conclusion: https://github.com/plausible/analytics/discussions/1963

dns_snek · 31d ago
I see, I was confused because you mentioned GDPR but it has everything to do with ePD and I wasn't aware of this issue, thanks for sharing!

> Arguing otherwise is like claiming it’s legal to steal from a store as long as you return the goods the next day - it’s legal fantasy.

That said, this strongly implies that these privacy-focused analytics platforms are unquestionably breaking the GDPR and behaving in an unethical way, but that seems like a huge overstatement.

I've read the linked blog post and it seems like the analysis hinges on the precise wording of the ePD rather than GDPR. By their own admission, these analytics solutions seem to be in line with both the letter and the spirit of GDPR. The author even agrees that the wording of the ePD should be addressed and notes:

> Unfortunately I came to the rather demotivating conclusion that there simply isn’t any way to implement web analytics without running afoul of the ePrivacy Directive.

> This was a surprising conclusion at the time. Morally we can go very far: we can put a lot of smart stuff together and create a system that can’t be used to track individual users. But legally, that doesn’t particularly matter. The ePrivacy Directive is written as it is.

> Even the EU Data Protection Working Party decries this. In their 2012 opinion they write:

> the Working Party considers that first party analytics cookies are not likely to create a privacy risk when they are strictly limited to first party aggregated statistical purposes and when they are used by websites that already provide clear information about these cookies in their privacy policy as well as adequate privacy safeguards. […] In this regard, should article 5.3 of the Directive 2002/58/EC be re-visited in the future, the European legislator might appropriately add a third exemption criterion to consent for cookies that are strictly limited to first party anonymized and aggregated statistical purposes.

So it's not that these companies are doing anything inherently immoral or unethical as far as their handling of personal data goes, but they might be behaving unethically by making claims that run afoul of other legislation (ePD) that clashes with the GDPR.

gitroom · 31d ago
so many takes here tbh, i always end up just picking the simplest tool and hoping for the best - you think real privacy or just ease of use is what ends up mattering more long-term?
dkga · 31d ago
Interesting!