I once ended up participating in a rock, paper, scissors strategy competition on a whim.
At that time I was quite tired of polishing and repolishing a paper draft at my university. (I write very poorly). And there it was, an announcement of this fun competition. The deadline was just an hour away.
I had no time to implement any respectable algorithm myself.
So, all my submission did was take the history of my opponent's plays, up to the present point in the ongoing joust, and extend it with each of the three possible next moves. I would compress each of the three resulting strings. Whichever completion gave the shortest compressed string, I took to be my opponent's predicted play, and I played whatever beat that prediction.
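A minimal sketch of that compress-and-compare idea (my reconstruction for illustration, not the original submission; zlib stands in for the compressor):

```python
import zlib

MOVES = "RPS"
BEATS = {"R": "P", "P": "S", "S": "R"}  # what beats each move

def predict_next(history: str) -> str:
    """Return the opponent move whose appending compresses smallest.

    A completion that continues an existing pattern compresses better
    than one that breaks it, so the smallest result is the prediction.
    """
    sizes = {m: len(zlib.compress((history + m).encode())) for m in MOVES}
    return min(sizes, key=sizes.get)

def my_move(history: str) -> str:
    return BEATS[predict_next(history)]

# A repetitive opponent is easy to predict:
print(my_move("PSR" * 20))  # counters the predicted continuation
```

Note that general-purpose compressors have fixed per-stream overhead, so on very short histories the three sizes can tie; the signal only emerges once the opponent has produced enough moves.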
This did not win the competition, but it survived the knockout rounds far longer than I expected. If I remember correctly, I used plain old zip. A compressor that converges faster would have been better.
In essence my strategy was similar to a Markov chain predictor. It would have been an exact match had I used a Markov-model-based compressor.
The number of rounds of play against any one opponent was not long. Had it been, Lempel-Ziv, being a universal compressor, would have been hard to beat.
Of course we know from game-theoretic analysis what the Nash strategy is for this game, but where is the fun in playing that?
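For reference, the Nash strategy here is just uniform random play, which makes every opponent strategy score zero in expectation; a tiny sketch of why:

```python
import random

def nash_move() -> str:
    """The Nash equilibrium for RPS: play uniformly at random."""
    return random.choice("RPS")

# 1 if the first move beats the second:
WINS = {("R", "S"): 1, ("S", "P"): 1, ("P", "R"): 1}

def payoff(a: str, b: str) -> int:
    return WINS.get((a, b), 0) - WINS.get((b, a), 0)

# Against any fixed opponent move b, the uniform player's
# expected payoff is zero, so there is nothing to exploit:
for b in "RPS":
    ev = sum(payoff(a, b) for a in "RPS") / 3
    print(b, ev)  # each prints 0.0
```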
One might think that a transformer-based player would be an excellent strategy; not necessarily. One needs to learn the potentially non-stationary strategy of the opponent from very few examples - (near) zero-shot online learning is required.
If I had more time, I would have tried context tree weighting - a universal learner for stochastic strings. The failure mode would be non-ergodic play by the opponent; the approach assumes that it, too, is just another stochastic parrot.
90s_dev · 1d ago
> I write very poorly
Probably not. Most writing is self-congratulatory garbage. The best writing I've ever read was not meant to be fancy but got right to the point from the heart, by people who couldn't care less what others thought about them. Your writing in this comment was fine and easy to read.
srean · 1d ago
Thanks for the kind validation.
I am still very bad as far as writing research papers goes. That's one area where my advisor's feedback helped a lot. Anything I wrote would need multiple passes.
My writing is bad partly because of bad grammar and partly because, once the cool results of the research are done, my heart is really not in writing it up. It becomes a chore.
EDIT: I see no reason why your comment got down-voted.
90s_dev · 1d ago
Ignore the downvotes. They don't really mean anything on HN, and points fluctuate wildly over the course of 24-48 hours in both directions. I have a theory that HN literally adds random up/down votes to comments periodically just to mess with us.
fidotron · 1d ago
The canonical RPS programming contest is http://www.rpscontest.com/ and many of the entries are surprisingly good.
Of course these things will always have to assume a semi-strategic opponent.
There goes my afternoon, nicely nerd sniped :) Thank you.
fidotron · 1d ago
Thanks for playing everyone.
As mentioned, I was slightly surprised to see this trending this morning given I thought it had gone nowhere in a prior submission.
Some tech details no one asked about:
* Custom WebGL2 renderer
* Leaderboards are in FoundationDB (talk about overkill)
The Markov chains are a fairly standard five-chain ensemble setup. Each chain is configured to process the last k of your previous moves (for k in 1..5) and base its estimate of the next move on that. The ensemble mechanism takes care of working out which one to go with. This is obviously harder than if it cheated, but it's more fun. The actual Markov chain component took about an hour to write; the stats part was worse.
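A hypothetical reconstruction of such an ensemble (not the game's actual code, and the back-off rule here is just one way to resolve the chains): one frequency table per order k, with the ensemble preferring the longest context it has already seen.

```python
from collections import Counter, defaultdict

class MarkovEnsemble:
    """Order 1..5 Markov chains over the player's move history."""

    def __init__(self, orders=(1, 2, 3, 4, 5)):
        self.orders = orders
        # tables[k][context] counts which move followed each k-length context
        self.tables = {k: defaultdict(Counter) for k in orders}
        self.history = ""

    def observe(self, move: str) -> None:
        for k in self.orders:
            if len(self.history) >= k:
                self.tables[k][self.history[-k:]][move] += 1
        self.history += move

    def predict(self) -> str:
        # Back off from the longest context to the shortest.
        for k in sorted(self.orders, reverse=True):
            ctx = self.history[-k:]
            if len(self.history) >= k and self.tables[k][ctx]:
                return self.tables[k][ctx].most_common(1)[0][0]
        return "R"  # no data yet: arbitrary default

BEATS = {"R": "P", "P": "S", "S": "R"}

m = MarkovEnsemble()
for move in "RPSRPSRPS":
    m.observe(move)
print(BEATS[m.predict()])  # counters the predicted next move
```

The real game's "ensemble mechanism" presumably weighs the chains against each other (for example by recent accuracy) rather than simply backing off, but the counting core is the same.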
Got bored after 24 rounds.
Was up 20: 12 wins, 6 losses.
If you know it's learning using Markov chains, you can exploit that.
(I just tried to imagine what the computer would do based on my/its history, and tried to switch my strategy during the game - but I think I could not stick to that way of playing for long.)
avodonosov · 1d ago
My "all time" stats:
Played 145 times
Max credits 49
Max combo 8
Win 56 (39%)
Draw 35 (24%)
Loss 54 (37%)
My "this game" stats:
Played 94 times
Current credits 45
Max credits 49
Max combo 8
Win 39 (41%)
Draw 25 (27%)
Loss 30 (32%)
I was playing mostly as if against a living opponent.
When playing I was not exploiting any assumptions about it being a Markov learner (I don't even understand how it learns or how to exploit it).
While it cannot learn to beat me most of the time, it is playable.
How do you know it learns with a Markov chain? How exactly? (What states does the chain have? Other details?)
Why does the camera shake at every move? To me that's very annoying.
The game needs stats on the total credits I have spent, including previous rounds, so that I can see the total spent/won balance.
I am clearly winning on that measure.
fidotron · 1d ago
Firstly, thanks for playing!
> How do you know it learns with a Markov chain? How exactly? (What states does the chain have? Other details?)
I wrote it! It uses five chains of different lengths; each of those estimates the next move, and then a standard mechanism resolves those into a single signal.
> Why does the camera shake at every move? To me that's very annoying.
Yeah, it is. In the first version you actually spun around on the spot too, which was cool but made you feel sick, even on a phone. So I hacked it down to this, but it's not great.
> The game needs stats on the total credits I have spent, including previous rounds, so that I can see the total spent/won balance. I am clearly winning on that measure.
In the menu there is "statistics", which contains more than the live stats, and that might have more of what you want!
avodonosov · 46m ago
> It uses five chains of different lengths, and each of those estimates the next, then a standard way of resolving those to a single signal.
Can you explain in more detail? (If not a trade secret, of course.)
ponco · 1d ago
I had an old colleague with a favourite lunch-room trick - if challenged to Scissors, Paper, Rock he would win every single time. I presumed it was some kind of behavioural-psychology trick, but it didn't occur to me that it was just pattern recognition. Or maybe it was both. I should get in touch again...
yard2010 · 1d ago
A fun trick a co-worker taught me: when someone pops into your head like this, send them a message.
okwhateverdude · 1d ago
That split second before a throw resolves is a slice of time where you can observe what your opponent is throwing, and change yours at the very last moment.
ljf · 1d ago
I remember telling a friend I’d always beat him at the game, and I always did. I had no more skill than telling him I’d win each time - and I pretty much always did, especially if it was best of three.
croemer · 1d ago
Is there any evidence that it actually uses Markov chains? I don't see any mention. The title here might as well have been made up to troll HN.
fidotron · 1d ago
I did write it, it does use Markov chains, and the whole thing should be self evidently a troll.
For example, it is not necessarily trying to beat you. This was inspired by an experience with https://luduxia.com/whichwayround/ where I found most traffic came from people trying to break it, until one day it hit a mailing list and exploded.
mettamage · 1d ago
Why not make it a show HN?
fidotron · 1d ago
I have done other related ones before, and honestly find "Show HN" (understandably) involves a level of expected seriousness that other posts might not. Consequently I prefer to leave that only for things which are more serious in intent.
This one I updated the game a bit, submitted a link, and it went nowhere when I posted it some days ago. I woke to a Linode alert of sustained outgoing traffic, and it turned out the post had been resurrected.
jll29 · 1d ago
I was also wondering that - is the poster the author?
In any case Markov chains are a simple and elegant mechanism for making predictions, easy enough to understand that even school kids in grade eleven (which is when probability theory was introduced when I went through it) can follow the popular "weather forecasting" tutorial using MCs.
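That classroom example fits in a few lines; the transition probabilities below are invented for illustration:

```python
# Toy Markov-chain weather model, in the spirit of the classic
# classroom example (transition probabilities are made up).
P = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def forecast(today: str, days: int) -> dict:
    """Probability distribution over the weather `days` days ahead."""
    dist = {today: 1.0}
    for _ in range(days):
        nxt = {"sunny": 0.0, "rainy": 0.0}
        for state, p in dist.items():
            for succ, q in P[state].items():
                nxt[succ] += p * q
        dist = nxt
    return dist

print(forecast("sunny", 2))  # roughly {'sunny': 0.72, 'rainy': 0.28}
```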
Onwards, Hidden Markov Models (HMMs) add to MCs a layer of hidden states and associated emission probabilities. To learn these, check out Lawrence Rabiner's beautiful tutorial:
https://ieeexplore.ieee.org/document/18626
Apart from their simplicity and mathematical elegance, it is remarkable how little data and electricity these models require to do a good job.
rfl890 · 1d ago
Check OP's about. They are the author.
sparky_ · 1d ago
Good question. I'm at a 75% win/loss ratio by picking rock every single time. Doesn't appear to be learning in this session, anyway.
bigfishrunning · 1d ago
Good old rock, nothing beats that!
gnarlouse · 1d ago
I lasted 26 rounds with a max of 11 coins. I’m gonna pretend that’s good and not average so I can feel like I didn’t just get owned.
ljf · 1d ago
Wish I’d screenshotted it, but I quit at 32 points after a series of combos - I think I had no more than 15 rounds.
jimkleiber · 1d ago
32 and 10. I normally lose much faster when I play friends.
BrenBarn · 1d ago
I went 37 rounds with a max of 14.
duc_minh · 1d ago
114 rounds and counting with a max of 87 coins, a combo of 7. Guess I have some anti-Markov behaviours ¯\_(ツ)_/¯
sugarkjube · 1d ago
Yeah, if you know it's Markov, you can exploit that.
I really like the aesthetic - how is it programmed or animated?
BrenBarn · 1d ago
Apart from the Markov stuff, there's something hilariously surreal about the graphics and music of this. COMBO BREAKER!
rkagerer · 1d ago
Well there's a few minutes of my life I won't get back.
JohnKemeny · 1d ago
Are there some you do?
sethammons · 1d ago
One could argue that if the memory is accessed ever again, you are reusing that minute. If the memory will never marginally add to your lived experience, it could be viewed as unredeemable and time you don't get to leverage in the future, and thus is time wasted/time lost.
xnickb · 1d ago
> If the memory will never marginally add to your lived experience, it could be viewed as unredeemable
There is also the aspect of reinforcing existing memories, which is easy to miss with this criterion. Any single given experience won't significantly alter your life, but remove all of them and suddenly a change happens.
I for one had my gears grinding trying to recall my patterns and predict the moves of the "AI", especially past round 20.
bArray · 1d ago
My score:
38 wins (44%), 21 draws (24%), 28 losses (32%)
Max combo 7
I think I could have kept going too.
sweeter · 1d ago
I got a combo of 6 up to 45. Pretty fun. I tried to condition it into what I would normally do, and then tried to subvert that
Following your link I found this
https://daniel.lawrence.lu/programming/rps/
Submitted.
https://news.ycombinator.com/item?id=44105859
If you're interested in writing RPS players there have been many efforts, e.g. http://www.rpscontest.com/ including things linked elsewhere here like https://daniel.lawrence.lu/programming/rps/ .