> In quantum information theory, a mix of quantum mechanics and information theory, the Petz recovery map can be thought of as a quantum analog of Bayes' theorem
westurner · 20h ago
> Introduction.—Usually demonstrated by simple counting arguments involving urns and balls, Bayes’ rule has actually been argued to play a much deeper role in probability theory and logic, as the only consistent system for updating one’s beliefs in light of new evidence [1, 2, 3, 4, 5, 6]. As an alternative to the above axiomatic approach, Bayes’ rule can also be derived from a variational argument: the updated belief should be consistent with the new observations while deviating as little as possible from the initial belief. This is known as the minimum change principle [7, 8, 9, 10]. It formalizes the intuition that the new information should be incorporated into the agent’s knowledge in the “least committal” way, e.g. without introducing biases unwarranted by the data. Such fundamental insights can be seen as at least a motivation, if not an explanation, for the extraordinary effectiveness of Bayesian statistical inference in virtually all areas of knowledge.
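The minimum change principle quoted above can be checked numerically: among all updated beliefs consistent with an observed event, the one that deviates least (in KL divergence) from the prior is exactly the Bayes' rule posterior. A toy grid-search sketch, with a hand-picked prior (the numbers here are illustrative assumptions, not from the paper):

```python
import math

# Prior over three outcomes; the observed event E = {B, C} rules out A.
prior = {"A": 0.1, "B": 0.3, "C": 0.6}

def kl_to_prior(t):
    # q puts probability t on B and 1 - t on C (and 0 on the ruled-out A);
    # KL(q || prior) measures how much q deviates from the initial belief.
    return t * math.log(t / prior["B"]) + (1 - t) * math.log((1 - t) / prior["C"])

# Grid search over admissible updated beliefs q for the minimum-change one
best_t = min((i / 1000 for i in range(1, 1000)), key=kl_to_prior)

# Bayes' rule (conditioning on E) gives P(B | E) = 0.3 / (0.3 + 0.6) = 1/3
bayes_t = prior["B"] / (prior["B"] + prior["C"])
print(best_t, bayes_t)  # both ≈ 0.333
```

The grid minimizer lands on the conditional distribution, matching the paper's variational derivation in this discrete toy case.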
I feel like Bayes' rule is oversold though.
Is Bayes' rule alone good enough for fighting spam email, for example?
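For the spam case, the classic approach is Bayes' rule plus the naive independence assumption, i.e. a naive Bayes classifier. A minimal sketch with a made-up toy corpus (not a real filter):

```python
from collections import Counter

# Made-up training messages for illustration only
spam_msgs = ["win money now", "free money win"]
ham_msgs = ["meeting at noon", "lunch at noon"]

def likelihood(msgs):
    counts = Counter(w for m in msgs for w in m.split())
    total = sum(counts.values())
    # Laplace smoothing so unseen words don't zero out the product
    return lambda w: (counts[w] + 1) / (total + 1)

p_word_spam = likelihood(spam_msgs)
p_word_ham = likelihood(ham_msgs)
p_spam = len(spam_msgs) / (len(spam_msgs) + len(ham_msgs))  # prior = 0.5

def p_spam_given(msg):
    # Bayes' rule: P(spam | words) ∝ P(spam) * Π P(word | spam),
    # where the product over words is the naive independence assumption
    num_spam, num_ham = p_spam, 1 - p_spam
    for w in msg.split():
        num_spam *= p_word_spam(w)
        num_ham *= p_word_ham(w)
    return num_spam / (num_spam + num_ham)

print(p_spam_given("free money"))     # > 0.5: classified as spam
print(p_spam_given("lunch at noon"))  # < 0.5: classified as ham
```

Whether "just Bayes' rule" suffices in practice is exactly the independence question raised further down: real spam tokens are highly correlated.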
How large of a Bayesian belief network is necessary to infer the equations of n-body gravity in a fluid, without other fields?
How large of a Bayesian belief network is necessary to extrapolate the motions of the planets?
Then also predict - with resource costs - the precession of Mercury's perihelion: the deviation in Mercury's orbit predicted by General Relativity; and also the dynamics of the Gross-Pitaevskii equation, which describes turbulent, vortical superfluids.
Then also - with Bayesians or a Bayesian belief network - predict the outcomes in (fluidic nonlinear) n-body gravity experiments.
Do Bayesian models converge at lowest cost given randomly initialized, arbitrary priors? Do Bayesian models converge at lowest cost when describing nonlinear complex adaptive systems?
How do Bayesian methods compare to other methods for function approximation and nonlinear function approximation?
How do quantum Bayesian methods compare to other methods for function approximation and nonlinear function approximation?
westurner · 20h ago
Furthermore, naive Bayesian models assume statistical independence of observations, so they should not be applied when that assumption does not hold.
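A sketch of that failure mode with made-up numbers: if a piece of evidence is duplicated (perfectly correlated copies) and the copies are naively treated as independent, the update double-counts the evidence and inflates the posterior toward unwarranted certainty.

```python
# Hypothetical likelihoods of the same evidence under two hypotheses
p_e_h1, p_e_h2 = 0.9, 0.3
prior = 0.5

def posterior(n_copies):
    # Naive update treating n identical (perfectly correlated) copies
    # of one observation as if they were n independent observations
    h1 = prior * p_e_h1 ** n_copies
    h2 = (1 - prior) * p_e_h2 ** n_copies
    return h1 / (h1 + h2)

print(posterior(1))  # 0.75: the correct single-observation update
print(posterior(5))  # ≈ 0.996: the same information counted five times
```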
"LightGBM Predict on Pandas DataFrame – Column Order Matters" (2025) https://news.ycombinator.com/item?id=43088854 :
> [ LightGBM ] does not converge regardless of feature order.
> From https://news.ycombinator.com/item?id=41873650 :
>> Do algorithmic outputs diverge or converge given variance in sequence order of all orthogonal axes? Does it matter which order the dimensions are stated in; is the output sensitive to feature order, but does it converge regardless?
> Also, current LLMs suggest that statistical independence is entirely distinct from orthogonality, which we typically assume with high-dimensional problems. And, many statistical models do not work with non-independent features.
> Does this model work with non-independence or nonlinearity?
> Does the order of the columns in the training data CSV change the alpha of the model; does model output converge regardless of variance in the order of training data?
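Even below the modeling level, column order can matter for a mundane reason: floating-point addition is not associative, so the same feature values summed in a different order can produce different results. A minimal, deterministic Python example:

```python
# Same three values, two orders, two different sums
a = [1e16, 1.0, -1e16]
b = [1e16, -1e16, 1.0]

print(sum(a))  # 0.0: the 1.0 is lost to rounding against 1e16
print(sum(b))  # 1.0: cancelling the large terms first preserves it
```

So even an algorithm that is order-invariant on paper can be order-sensitive as implemented.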
From https://news.ycombinator.com/item?id=37462132 :
> [ quantum discord ]
> TIL the separable states problem is considered NP-hard, and many models specify independence of observation as necessary.
How does (NP-hard) quantum separability relate to statistical independence as necessary for statistical models to be appropriate?
If it is so hard to determine which particles are and aren't entangled, when should we assume statistical independence of observation?
If we cannot assume statistical independence of observations, then naive Bayesian models aren't appropriate.
Self-attention in transformer networks is decidedly not Bayesian, but self-attention doesn't model truth or truthiness either.
Transformer self-attention only models frequency of observation in the sample, not truthiness.
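A toy single-query attention head makes that point concrete (pure Python, scalar keys and values for illustration; real attention uses vectors and learned projections): the output is just a similarity-weighted average of values, so repeated or query-matching values dominate, and nothing in the mechanism represents whether any value is true.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(query, keys, values):
    scores = [query * k for k in keys]  # dot product in one dimension
    weights = softmax(scores)
    # Output = weighted average of values; weights reflect similarity
    # and repetition in the sample, not correctness
    return sum(w * v for w, v in zip(weights, values))

# Three of four keys match the query, so their value (5.0) dominates the mix
out = attention(2.0, [1.0, 1.0, 1.0, -1.0], [5.0, 5.0, 5.0, 0.0])
print(out)  # close to 5.0: pulled toward the frequent, matching value
```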