I've been doing data science for a while now, and for me personally:
Not really.
The SVD is much more important.
No.
Yes.
Yes.
No (Radon-Nikodym), yes (conditional expectation).
Yes.
Yes.
Yes.
Personally, no.
Only in the usage of MCMC.
Yes.
Yes.
No.
Of course.
All the time.
Yes.
Yes.
The most I'll do is remember to use the sample standard deviation.
No.
Yes.
No.
Yes.
Yes.
Yes.
No.
No.
Yes.
I just use a solver.
See above.
See above.
Of course.
Yes.
Yes.
Not privileged with respect to other bases.
Of course.
I've never needed it.
Ditto.
As another tool in the toolbox.
They would not be my first or second choice.
Yes.
No.
No.
Yes.
No.
Yes.
Yes.
Yes.
No.
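On the sample standard deviation remark above (the reply to "Unbiased estimation"), a minimal sketch in Python. The data are made up and the function names are mine, purely for illustration. The only point is the divisor: n - 1 makes the variance estimate unbiased, while n gives the population formula.

```python
import statistics


def population_std(xs):
    """Standard deviation with divisor n (treats xs as the whole population)."""
    n = len(xs)
    mean = sum(xs) / n
    return (sum((x - mean) ** 2 for x in xs) / n) ** 0.5


def sample_std(xs):
    """Standard deviation with divisor n - 1 (Bessel's correction)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    return (sum((x - mean) ** 2 for x in xs) / (n - 1)) ** 0.5


# Hypothetical observations, not taken from the thread.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(population_std(data))    # divisor n
print(sample_std(data))        # divisor n - 1, always the larger of the two
print(statistics.stdev(data))  # the standard library's sample version
```

In practice `statistics.stdev` (sample) and `statistics.pstdev` (population) already cover both cases; the hand-written versions are there only to show the divisor.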
The topics you mention are maths or applied maths topics. "Data Science" is a bubbly term that roughly means "take that big dump of data and give me some advice on how to make more money", so your list, very sadly, has little relevance to it.
I've seen in other threads you recommended Neveu's book to cover some probability theory topics. Care to explain whether Halmos & Rudin would be sufficient pre-requisites?
Halmos, Measure Theory, is a good
prerequisite to Neveu. Rudin,
Principles, is a bit too little.
Instead, the first half, the real
half of Rudin's Real and Complex
Analysis is a good prerequisite.
So is Royden's Real Analysis.
Neveu is elegant beyond belief,
but Breiman, Probability, the SIAM
book, available in paperback, is
darned good, usually easier
than Neveu, less elegant, closer to
applications, and without some of the
special Tulcea material in the back
of Neveu. K. L. Chung also has a
good, comparable book. Even if you want
Neveu to be your main probability book,
which is fine, likely you should have
alternative treatments.
Of course, there is Loeve, Probability
-- written in English but somehow
sounding like French. It has a lot,
a little too much, but I liked the
topics I studied in it. It turns out,
Neveu and Breiman were both Loeve
students.
Halmos, Measure Theory,
is darned fun to read:
It has the three series theorem
and a famous exercise on regular
conditional probabilities.
I learned the stuff from a course
by A. Karr, a star student of
E. Cinlar. Karr's course was the
best course of any kind I ever took
in school. Powerful material,
beautifully presented, each day it
was a shame to erase the board.
The exercises in Neveu are usually
harder than the ones in Halmos,
Breiman, and Chung.
Neveu makes probability a crown jewel
of civilization.
The summer after Karr's course,
I sat in the library for six weeks
and walked out with a 50 page
manuscript that was all the research
and the first draft of my dissertation.
Net, probability at the level of Neveu
is darned powerful stuff, makes a lot
in research, and research for applications,
really easy -- that is, you really
know just what the heck you are doing
and can knock off new results
having fun sitting in bed next to
your wife while she watches TV
(warning -- not gender neutral!).
What I've outlined is sometimes
just called graduate probability.
The biggest difference is that the
whole subject makes daily use
of measure theory.
I don't know how much you need in
probability before starting on
graduate probability. In my case,
graduate probability was my
first serious study of probability,
and I never felt that I was not prepared.
But in my career I'd done a lot of
practical work in both probability
and statistics -- e.g., multivariate
statistics, hypothesis testing,
stochastic processes, digital filtering,
the fast Fourier transform, beam forming
(a case of antenna theory),
power spectral estimation (US Navy
sonar type stuff), how to get the
central limit theorem out of
digital filtering,
and more, random number generation, etc.
That
work was plenty of intuitive background
for graduate probability.
But in much of that work I
was struggling due to what, really,
at that level, is commonly
weak basic knowledge of probability.
So, after those struggles, seeing
graduate probability be all
clean and powerful was great.
I can't advise on just how much
elementary probability you might
need to have enough intuition
to be comfortable with graduate
probability. I will say, you do
need both the intuitive experience
and also the solid math.
I feel sorry for people who work
in prob/stat without a background
in grad prob: The elementary
stuff is too often just confused
from poor understanding from
a poor background.
The sources I mentioned above were really
the first sources from which I did
any real study. Net, the elementary
material of prob/stat is really too simple
to be taken very seriously. So, for your
first serious effort, just go for
graduate probability from the sources
above.
The Neveu, etc., material is much of the
foundation for the secret sauce
of my startup.
Thanks for the insights. Chung seems quite doable at my current level. I skimmed through it some time ago. I borrowed a copy of Neveu and it seemed a bit harder.
Care to share other references you like? Real & complex analysis and algebra, in particular, are most welcome.
I've mentioned books I've spent at least
some significant time with.
There are lots more books on my shelves
that look good, have good recommendations,
etc., but that I haven't paid much attention to.
My interest in algebra is a bit meager --
I'm not seriously interested in number
theory, algebraic geometry, algebraic
topology, etc.
For real analysis, the books I mentioned
seem to me to provide really good sources.
Of course there is much more to analysis,
e.g., functional analysis. And there's
a lot to stochastic processes. And much
more to math.
If you want to dip your toes in algebraic geometry and functional analysis, you could do a lot worse than Lang's book on SL(2,R) for the former and Bollobas' for the latter.
Matrix row rank and column rank are equal.
In matrix theory, the polar decomposition.
Each Hermitian matrix has an orthogonal basis of eigenvectors.
Weak law of large numbers.
Strong law of large numbers.
The Radon-Nikodym theorem and conditional expectation.
Sample mean and variance are sufficient statistics for independent, identically distributed samples from a univariate Gaussian distribution.
The Neyman-Pearson lemma.
The Cramer-Rao lower bound.
The martingale convergence theorem.
Convergence results of Markov chains.
Markov processes in continuous time.
The law of the iterated logarithm.
The Lindeberg-Feller version of the central limit theorem.
The normal equations of linear regression analysis.
Non-parametric statistical hypothesis tests.
Power spectral estimation of second order, stationary stochastic processes.
Resampling plans.
Unbiased estimation.
Minimum variance estimation.
Maximum likelihood estimation.
Uniform minimum variance unbiased estimation.
Wiener filtering.
Kalman filtering.
Autoregressive moving average (ARMA) processes.
Rank statistics are always sufficient.
Farkas' lemma.
Minimum spanning trees on directed graphs.
The simplex algorithm of linear programming.
Column generation in linear programming (Gilmore-Gomory).
The simplex algorithm for min cost capacitated network flows.
Conjugate gradients.
The Kuhn-Tucker conditions.
Constraint qualifications for the Kuhn-Tucker conditions.
Fourier series.
The Fourier transform.
Hilbert space.
Banach space.
Quasi-Newton iteration and updates, e.g., Broyden-Fletcher-Goldfarb-Shanno.
Orthogonal polynomials for numerically stable polynomial curve fitting.
Lagrange multipliers.
The Pontryagin maximum principle.
Quadratic programming.
Convex programming.
Multi-objective programming.
Integer linear programming.
Deterministic dynamic programming.
Stochastic dynamic programming.
The linear-quadratic-Gaussian case of dynamic programming.
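On the polar decomposition item in the list, and the reply that the SVD is much more important: for a square matrix the polar decomposition falls straight out of the SVD, which is one way to read that reply. A short sketch, where the symbols U, Σ, V, W and P are my notation and not from the thread:

```latex
% Assume A is a square complex matrix with singular value decomposition
%   A = U \Sigma V^*,  U and V unitary,  \Sigma diagonal with nonnegative entries.
\[
  A \;=\; U \Sigma V^{*}
    \;=\; \underbrace{(U V^{*})}_{W}\,\underbrace{(V \Sigma V^{*})}_{P},
\]
% W = U V^* is unitary, being a product of unitary matrices.
% P = V \Sigma V^* is Hermitian and positive semidefinite, with P^2 = A^* A.
\[
  W^{*} W = I, \qquad P = P^{*} \succeq 0, \qquad P^{2} = V \Sigma^{2} V^{*} = A^{*} A .
\]
```

So once you have the SVD, the polar factors cost one extra matrix product each.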