Here are some topics. Are they considered relevant to *data science*? Matrix row...

GFK_of_xmaspast · on July 17, 2015

I've been doing data science for a while now, and for me personally:

Not really. The SVD is much more important. No. Yes. Yes. No (R-N) yes (CE). Yes. Yes. Yes. Personally, no. Only in the usage of MCMC. Yes. Yes. No. Of course. All the time. Yes. Yes. The most I'll do is remember to use the sample standard deviation. No. Yes. No. Yes. Yes. Yes. No. No. Yes. I just use a solver. See above. See above. Of course. Yes. Yes. Not privileged w/r/t/ other bases. Of course. I've never needed it. Ditto. As another tool in the toolbox. They would not be my first or second choice. Yes. No. No. Yes. No. Yes. Yes. Yes. No.

S4M · on July 16, 2015

The topics you mention are maths or applied maths topics. "Data Science" is a bubbly term that roughly means "take that big dump of data and give me some advice on how to make more money", so your list, very sadly, has little relevance to it.

graycat · on July 17, 2015

Most of those topics I listed are supposed to be good at taking data and saying how to "make more money"!

nextos · on July 17, 2015

I've seen in other threads you recommended Neveu's book to cover some probability theory topics. Care to explain whether Halmos & Rudin would be sufficient pre-requisites?

graycat · on July 17, 2015

Halmos, Measure Theory, is a good prerequisite for Neveu. Rudin, Principles, is a bit too little. Instead, the first half, the real half, of Rudin's Real and Complex Analysis is a good prerequisite. So is Royden's Real Analysis.

Neveu is elegant beyond belief, but Breiman, Probability, the SIAM book, available in paperback, is darned good, usually easier than Neveu, less elegant, closer to applications, and without some of the special Tulcea material in the back of Neveu. K. L. Chung also has a good, comparable book. Even if you want Neveu to be your main probability book, which is fine, you should likely have alternative treatments.

Of course, there is Loeve, Probability -- written in English but somehow sounding like French. It has a lot, a little too much, but I liked the topics I studied in it. It turns out, Neveu and Breiman were both Loeve students.

Halmos, Measure Theory, is darned fun to read: It has the three series theorem and a famous exercise on regular conditional probabilities.

I learned the stuff from a course by A. Karr, a star student of E. Cinlar. Karr's course was the best course of any kind I ever took in school. Powerful material, beautifully presented; each day it was a shame to erase the board.

The exercises in Neveu are usually harder than the ones in Halmos, Breiman, and Chung.

Neveu makes probability a crown jewel of civilization.

The summer after Karr's course, I sat in the library for six weeks and walked out with a 50 page manuscript that was all the research and the first draft of my dissertation. Net, probability at the level of Neveu is darned powerful stuff, makes a lot in research, and research for applications, really easy -- that is, you really know just what the heck you are doing and can knock off new results having fun sitting in bed next to your wife while she watches TV (warning -- not gender neutral!).

What I've outlined is sometimes just called graduate probability. The biggest difference from elementary probability is that the whole subject makes daily use of measure theory.

I don't know how much you need in probability before starting on graduate probability. In my case, graduate probability was my first serious study of probability, and I never felt that I was not prepared.

But in my career I'd done a lot of practical work in both probability and statistics -- e.g., multivariate statistics, hypothesis testing, stochastic processes, digital filtering, the fast Fourier transform, beam forming (a case of antenna theory), power spectral estimation (US Navy sonar type stuff), how to get the central limit theorem out of digital filtering, random number generation, and more. That work was plenty of intuitive background for graduate probability.

But in much of that work I was struggling because my basic knowledge of probability was weak, which, really, is common at that level. So, after those struggles, seeing graduate probability be all clean and powerful was great.

I can't advise on just how much elementary probability you might need to have enough intuition to be comfortable with graduate probability. I will say, you do need both the intuitive experience and also the solid math.

I feel sorry for people who work in prob/stat without a background in grad prob: The elementary stuff is too often just confused, the result of poor understanding from a poor background.

The sources I mentioned above were really the first sources from which I did any real study. Net, the elementary material of prob/stat is really too simple to be taken very seriously. So, for your first serious effort, just go for graduate probability from the sources above.

The Neveu, etc., material is much of the foundation for the secret sauce of my startup.

nextos · on July 17, 2015

Thanks for the insights. Chung seems quite doable at my current level. I skimmed through it some time ago. I borrowed a copy of Neveu and it seemed a bit harder.

Care to share other references you like? Real & complex analysis and algebra, in particular, are most welcome.

graycat · on July 18, 2015

> Real & complex analysis and algebra,

I've mentioned books I've spent at least some significant time with.

There are lots more books on my shelves that look good, have good recommendations, etc., but that I haven't paid much attention to.

My interest in algebra is a bit meager -- I'm not seriously interested in number theory, algebraic geometry, algebraic topology, etc.

For real analysis, the books I mentioned seem to me to provide really good sources. Of course there is much more to analysis, e.g., functional analysis. And there's a lot to stochastic processes. And much more to math.

selimthegrim · on July 20, 2015

If you want to dip your toes in algebraic geometry and functional analysis, you could do a lot worse than Lang's book on SL(2,R) for the former and Bollobas' for the latter.

cf. http://maths-magic.ac.uk/course.php?id=339

graycat · on July 30, 2015

Thanks