Ask HN: What are you *not* working on?
7 points by password4321 on 8/16/2025, 4:02:48 PM | 8 comments
In the spirit of casting an even wider net in a low-key weekend discussion, I'd like to hear more about projects that are no longer a priority, and even just ideas, perhaps offered with the hope that someone else can at least share their feedback or even make your dream come true!
User david927's unofficial monthly "What are you working on?" threads (July: https://news.ycombinator.com/item?id=44702833) are always a hit, with many comments about projects either not quite ready for their own 'Show HN' or linking to one that didn't get traction... I'd like to use this submission to open the discussion up to include the not-yet/not-currently makers.
(2) I have a QooCam Ego stereo camera; so far my main way of sharing images is the red/cyan anaglyph, which only goes so far. I want to make something for sharing them online, which probably includes some UI to adjust the ‘stereo window’ by shifting the images horizontally, and something where you can walk around in an art gallery. It can all be done with WebXR, three.js, A-Frame and all that, but to get it to work on the Meta Quest 3 I’ll need to be careful about memory management. I might do it someday, but I am so busy taking photos, developing photos, posting on my socials, and trying to sell more sports and event work.
(3) I want to build a general-purpose text classification framework that automatically trains and selects from: bag of words, ModernBERT + pooling + classical ML, and ModernBERT + (Bi)LSTM. I was working to build one in 2017 at a job, and it was marginally successful; today I know a lot more and the technology is better. This is blocked by transitioning my ‘nemesis’ system for collecting training data from ArangoDB to Postgres, though maybe that’s an excuse, because if I didn’t care about my other projects in the queue I could start working from Kaggle downloads, which I need to do anyway since it has to be ‘general’ and not just work for my problems.
Re: #2 VR - there are a number of one-and-done demos on GitHub rendering 2D or 3D images/models for a VR gallery, though none seem too concerned about performance. I'm not sure if you'd get more traction perfecting that or just focusing on your stereo camera (which might be valuable even just as a page to view one image, but ultra niche).
Sorry, I don't know enough about text classification, so instead I'd ask: could you share the best resource you used to get started working on them? It seems necessary to have a solid foundational understanding as LLMs take over.
The Java program is a mess though, and there are a lot of chess programs out there. Both of those programs are based on knowledge I got from
https://www.chessprogramming.org/Main_Page
I've made some really simple demos of VR rendering that play well on my MQ3 when it is attached to a computer which is running the web browser -- they tend to choke running in standalone mode, which I think is the main market, so the product I want is going to be one tuned up for memory use. I think the problem is that the MQ3 just doesn't have a lot of RAM, and a lot of it is taken up by the OS and the standard UI. A single DSLR photograph would be something like 6000x4000x3 = 72 MB as a texture, and it would have to be unpacked to view it. I've seen some good demos that run in WebXR on the MQ3, so I know it's possible, but I'll have to really tune it.
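To make the memory math concrete, here's a rough Python sketch of the texture size and a pre-downscaling step with Pillow; the 2048 px target edge is just a guess at a comfortable size for standalone mode, not a measured Quest 3 limit.

    # Rough texture-memory math for stereo photos on a standalone headset.
    # Assumes photos are pre-downscaled server-side with Pillow; the 2048 px
    # target below is a guessed comfortable size, not a measured Quest 3 limit.
    from PIL import Image

    def texture_bytes(width: int, height: int, channels: int = 3, mipmaps: bool = True) -> int:
        """Approximate GPU memory for an uncompressed texture (a full mipmap chain adds ~1/3)."""
        base = width * height * channels
        return int(base * 4 / 3) if mipmaps else base

    def downscale_for_headset(path: str, out_path: str, max_edge: int = 2048) -> None:
        """Shrink a DSLR frame so its longest edge fits max_edge before serving it."""
        img = Image.open(path)
        img.thumbnail((max_edge, max_edge))  # in place, preserves aspect ratio
        img.save(out_path, quality=90)

    print(texture_bytes(6000, 4000))   # ~96 MB with mipmaps (72 MB base)
    print(texture_bytes(2048, 1365))   # ~11 MB with mipmaps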
As for text classification, I have an RSS reader that uses text classification for a recommender; here it is running on the MQ3:
https://mastodon.social/@UP8/114910543438621522
It downloads maybe 20,000 RSS feed items, runs them through
https://sbert.net/
to turn them into vectors (easy!) and then clusters them with k-means clustering to divide them into 20 "topics"
https://scikit-learn.org/stable/modules/generated/sklearn.cl...
It picks the 10 highest-scoring articles out of each cluster and adds another 100 randomly chosen articles to show me 300 articles. I give a thumbs-up or thumbs-down judgement on each article and use the vectors as X and the judgements as y for
https://scikit-learn.org/stable/modules/svm.html
with the probability option to compute the scores.
This system is super-reliable and fast: in three minutes it trains something like 20 models and picks the best.
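Roughly, the pipeline looks like this (a minimal sketch with sentence-transformers and scikit-learn; the model name and the sampling details are guesses filled in from the description above, not the actual code):

    # Sketch of the recommender pipeline described above:
    # SBERT embeddings -> k-means topics -> SVM with probability scores.
    import random
    import numpy as np
    from sentence_transformers import SentenceTransformer
    from sklearn.cluster import KMeans
    from sklearn.svm import SVC

    def embed(texts: list[str]) -> np.ndarray:
        model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice
        return model.encode(texts)

    def pick_candidates(texts: list[str], clf: SVC | None, n_topics: int = 20) -> list[int]:
        """Pick the 10 top-scoring items per topic plus 100 random ones (~300 total)."""
        X = embed(texts)
        topics = KMeans(n_clusters=n_topics, n_init="auto").fit_predict(X)
        scores = clf.predict_proba(X)[:, 1] if clf else np.random.rand(len(texts))
        chosen: list[int] = []
        for t in range(n_topics):
            idx = np.where(topics == t)[0]
            chosen += list(idx[np.argsort(scores[idx])[-10:]])
        chosen += random.sample(range(len(texts)), k=100)
        return chosen

    def train_from_judgements(texts: list[str], y: list[int]) -> SVC:
        """Fit an SVM on the embeddings with probability estimates enabled."""
        return SVC(probability=True).fit(embed(texts), y)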
My approach based on SBERT gets a good sense of the "gist" of something but doesn't really understand the order of words, can't handle negation, is not so good for sentiment analysis and more complex kinds of tasks -- but my recommender doesn't really need a highly accurate model because my judgements are not accurate, I might judge the same article up or down depending on how I feel that day.
People who hold court on the Hugging Face forums tend to advocate "fine tuned BERT models" for classification; I think they're for the birds. I see a lot of arXiv papers where people copy a training recipe from another paper, but I haven't seen a recipe that consistently makes good models -- I don't want to write papers, I want a system that a person who just has text and judgements can push a button and get a good classifier for a wide range of problems.
I've worked on LSTM trainers in the past and found I could develop reliable training procedures for them. The literature tends to show these often beat the "fine tuned BERT" models by a bit, so I am really interested in making one that "just works": you give it documents as your X and your judgements as y and it will train a bunch of models and give you the best.
https://scikit-learn.org/stable/model_selection.html
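Something like this sketch is what I mean by "train a bunch of models and pick the best" -- the candidate list and the roc_auc metric here are placeholders, not my actual recipe:

    # Push-button model selection: cross-validate a few candidate classifiers
    # on the embedded text and keep the best. Candidates and scoring are placeholders.
    import numpy as np
    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC, LinearSVC

    def pick_best_model(X: np.ndarray, y: np.ndarray):
        candidates = {
            "svc_rbf": SVC(probability=True),
            "svc_linear": CalibratedClassifierCV(LinearSVC()),
            "logreg": LogisticRegression(max_iter=1000),
        }
        scored = {
            name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
            for name, model in candidates.items()
        }
        best = max(scored, key=scored.get)
        return candidates[best].fit(X, y), scored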
> I want a system that a person who just has text and judgements can push a button and get a good classifier
This seems very practical. These systems could add a lot of value filtering sites like HN to follow trends in technologies someone is interested in, or drilling down into popular topics as a tool to find what to generate content about when building an audience.
- An entirely user-mode Docker container runtime for Windows, which, based on my initial research, would have to be QEMU running Linux, slowly. Maybe it could all be WASM'd into the browser (even more slowly)?
- Nested Windows RDP X509 [smart card] authentication through Guacamole or other browser-based RDP client into another RDP session.
- Free backup, as formerly offered by CrashPlan before "Home Family" sharing ended in 2017, using the empty hard drive space of people you know and trust. Version 2 would support a virtual file system allowing you to specify the number of redundant copies of replaceable, rarely/never-used operating system and software files to pull over the network when needed instead of duplicating them on all computers, ideally with a slider choosing between disk space and network bandwidth. I considered licensing VirusTotal so that my file system, for anything that might exist on another computer, would basically be a SHA256 hash list + local cache (a rough sketch follows this list), but that costs enough that it would have to be a commercial business.
- My no-longer-supported Amazon Glow, a $250 (refunded) Amazon experiment that was the most awesome piece of technology connecting children with their relatives I've ever seen. I'd like to be able to deploy my own software, but I'm guessing it remains as locked down as the Echo etc. https://www.reddit.com/r/hardwarehacking/comments/119acbh/re... https://instrumental.com/resources/teardown/amazon-glow/
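For the backup item above, the hash list + local cache idea would look roughly like this (a hypothetical sketch; the manifest layout and cache location are placeholders, not a design I've built):

    # Hypothetical sketch of the hash-list idea: a manifest mapping relative
    # paths to SHA256 digests, plus a local content-addressed cache so files
    # that exist elsewhere never need to be stored twice.
    import hashlib
    import json
    from pathlib import Path

    def sha256_of(path: Path) -> str:
        h = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def build_manifest(root: Path) -> dict[str, str]:
        """Map every file under root to its SHA256 digest."""
        return {
            str(p.relative_to(root)): sha256_of(p)
            for p in root.rglob("*") if p.is_file()
        }

    def fetch(digest: str, cache: Path = Path("~/.hashcache").expanduser()) -> bytes | None:
        """Return cached bytes for a digest, or None so the caller pulls it over the network."""
        blob = cache / digest
        return blob.read_bytes() if blob.exists() else None

    # manifest = build_manifest(Path("C:/Windows/System32"))
    # json.dump(manifest, open("manifest.json", "w"), indent=2)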
It was a way to generate FCP (Final Cut Pro) XML files.
Was a lot of fun for a few months but that's kinda typical for me. I seem to always move on.
Interesting point on your demo video re: importing from YouTube and enabling hardware acceleration with h264_videotoolbox for the conversion.
Now you just have to find a good search term, pull down the most-viewed videos, and use YouTube's "most replayed" graph with AI-powered cleanup of where to cut to create an automatic best-of aggregator!