30 Hadoop and Big Data Spelunkers Worth Following

Understanding the basic purpose of Hadoop is easy: it offers a way to quickly store, process and extract meaning from vast datasets. It does this by breaking the datasets up into commodity-server-sized chunks, replicating each chunk so that the failure of a single machine does not lose data, and distributing the chunks across a connected cluster of commodity servers (nodes). Understanding how it integrates with the current big data landscape, and how it may integrate with the future one, is a little harder. For that, I’ve turned to the experts. Luckily for me, and for you if you’re in my boat, many of them maintain active Twitter and blogging presences. Even more luckily, the quality and clarity of their writing is very high. The following list is by no means exhaustive, but poking into the thoughts of even a few can shed light on everything from machine learning to data modeling and distributed systems.

1.) Hilary Mason

2.) Todd Lipcon

3.) Daniel Abadi

4.) Gary Helmling

5.) Josh Patterson

6.) Chris Neuman

7.) Doug Cutting

8.) Peter Skomoroch

9.) Chris Mattmann

  • bio: Adjunct Assistant Professor in the Computer Science Department within USC’s Viterbi School of Engineering. Senior Computer Scientist, Jet Propulsion Laboratory
  • twitter: @chrismattmann
  • Read: The Case for the Digital Babelfish

10.) Arun C Murthy

11.) Dmitriy Ryaboy

12.) Andrew Ferguson

13.) Chad Metcalf

  • bio: Infrastructure Operations Engineer at Cloudera
  • twitter: @metcalf

14.) Florian Leibert

15.) Jeff Darcy

16.) Ryan Rawson

17.) Alex Feinberg

18.) Johan Oskarsson

  • bio: Developer at Twitter, formerly at Last.fm.
  • twitter: @skr
  • blog: blog.oskarsson.nu

19.) Greg West

  • bio: Salesforce R&D Engineer
  • twitter: @gwestr

20.) Alexander Popescu

21.) Amr Awadallah

22.) Ted Dunning

23.) Andrew McAfee

24.) James Kobielus

25.) Amund Tveit

26.) Thomas Brox Røst

27.) Vanessa Alvarez

28.) Jeff Hammerbacher

29.) Greg Wilson

  • bio: Software Engineer at Side Effects Software, creator of Software Carpentry, author of Data Crunching: Solve Everyday Problems Using Java, Python and More
  • blog: third-bit.com/blog
  • Read: Empirical Software Engineering (co-written with Jorge Aranda)

30.) Jeff Kelly
