Tuesday, December 17, 2013

Jobs for machine learning experts

I have compiled a list of companies hiring machine learning experts in the USA. These positions are mainly for people with my job profile, i.e., a fresh PhD graduate who has done research in machine learning. Let me know if you have any companies to add or remove. Note that this list might get outdated fast, as machine learning algorithms are increasingly being adopted by industry.

The positions can be divided into two categories. As a more experienced researcher pointed out to me, people are seldom successful in both areas.
A) Machine learning infrastructure: Creation of models or solvers for other people to use. You write general programs that can be used by other software programmers. I personally favor this category, as you are closer to theory, though writing applications once in a while would be interesting. Based on my own job search and feedback from others, the companies hiring in this stream are:

  • Google
  • IBM 
  • Amazon
  • Microsoft
  • Facebook
  • Skytree (a start-up that seems promising)
  • PARC
Apart from these companies, there are research labs such as Bosch Research, MERL, Microsoft Research, and IBM Research. Adobe apparently has a lab too, though I am not sure about this. Baidu might also be worth a look.

B) Machine learning applications: There are plenty of companies looking for machine learning engineers to work on specific applications. Such positions might need more work on feature selection and parameter tuning than the previous category. Nearly every big company has an ML team. The most popular applications are probably advertising, recommendations, text mining, fraud detection, search, ranking, etc. Apart from the companies listed above, I would look at:

  • LinkedIn
  • Spotify
  • Bloomberg
  • Apple
  • Start-ups such as Airbnb, Meetup, ... anything with a social component
  • Quantcast
  • Groupon
  • Rocketfuel
  • Twitter
  • eBay
  • Netflix
  • Adobe

Most positions ask for knowledge of Java, C++, Python, R, or MATLAB, in decreasing order of frequency. People with experience in big data technologies (Hadoop, Hive, etc.) will find several 'Data Scientist' positions available, though these mostly do not go to fresh graduates. Natural language processing and computer vision seem to be the most in-demand sub-fields. Deep learning is increasingly used in industry, though it might be just a transient trend.

Monday, May 30, 2011

Non-technical articles on machine learning

Noam Chomsky recently criticized statistical language modelling at a conference at MIT (Brains, Minds and Machines). Peter Norvig has given a cogent reply to Chomsky's aspersions in this essay.

Here's a good article from hunch.net on research directions for ML. Search the reddit ML forum for many posts on this topic.


Online resources to study machine learning

I have been collecting information on online resources for machine learning. I am sharing them in the hope that they will help other grad students. If you have some interesting ML links, please post them in the comments.

Definition
  • Many people are not clear about the boundaries (as flimsy as they may be) between traditional AI, neuroscience, and machine learning. This introduction by Tom Mitchell ought to help.
Where to study from?
  • Lectures by Andrew Ng (Stanford) are said to be a good start.
  • Hundreds more ML lectures are available at videolectures.net. These videos are good for learning more about the topics that interest you.
  • Long list of textbooks to read (links 1 and 2). This text is free online.
How to keep up with the field?
  • ML is a fast-changing field, and you need to keep up with the latest papers. The journals/conferences I found to be relevant are:
  1. JMLR
  2. ICML
  3. NIPS
  4. COLT
  5. Pattern Recognition
  6. TPAMI
  7. Machine Learning
  • I found these blogs and forums to be useful too:
  1. http://hunch.net/
  2. http://www.reddit.com/r/MachineLearning/
  3. http://metaoptimize.com/qa/
  4. http://mark.reid.name/iem
  5. CIML
Programming tools
  • Not everyone can afford MATLAB. Fortunately, Python (with its NumPy extension) can be used as a substitute.
  • Before implementing any standard algorithm, check whether an online ML library already provides it. (Tip: some libraries work only in *nix environments.)
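To give a feel for how NumPy can stand in for MATLAB, here is a minimal sketch of a few everyday matrix operations with their rough MATLAB counterparts in the comments. The matrix and vector values are made up for illustration, and the `@` operator assumes Python 3.5 or later.

```python
import numpy as np

# Illustrative data only.
# MATLAB: A = [2 1; 1 3];  b = [3; 5];
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Solve the linear system A x = b.
# MATLAB: x = A \ b
x = np.linalg.solve(A, b)

# Matrix product versus elementwise product.
# MATLAB: A * A   and   A .* A
matrix_product = A @ A
elementwise_product = A * A

# Transpose.
# MATLAB: A'
A_transpose = A.T

# Eigen-decomposition of a symmetric matrix.
# MATLAB: [V, D] = eig(A)
eigenvalues, eigenvectors = np.linalg.eigh(A)

print("x =", x)
print("A @ A =\n", matrix_product)
print("A * A =\n", elementwise_product)
print("eigenvalues =", eigenvalues)
```

One difference that tends to trip up people coming from MATLAB: in NumPy, `*` on arrays is elementwise, and indexing starts at 0 rather than 1.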
Machine learning has a lot in common with statistics; in fact, it can be said to be derived from statistics. This link has a funny take on why ML is more popular.

Addendum (06/01/2010)
Here are some links to online lecture videos on related topics. There might be other course videos out there. Do your own search if you have time.
  • Calculus: Several course videos here. My pick
  • Linear Algebra: The famous MIT course by Gilbert Strang
  • Probability theory: From UCLA
  • Convex optimization: Stanford lectures by Stephen Boyd. Part 2 of the course is available on YouTube
  • Useful discussion on large scale machine learning