Score: 9/10
Type: Book


At last, a book that lowers the barrier to entry for data science! Foreman wrote an easy-to-follow book that enables you to do data science using Microsoft Excel. The book is rich with screenshots and step-by-step explanations of how to apply data analysis techniques.

Data Smart
Get It!


John Foreman


Kirstein, Carl

The Good:

  • teaches you the principles behind data science
  • teaches you how to apply the principles of data science (albeit with smallish data sets)
  • the cheeky writing style makes this arduous subject easy to follow and almost… fun
  • if nothing else, your Excel skills will be vastly improved… even if you are a seasoned Excel user
  • you will approach data in a completely new way. Rather than being overwhelmed, you’ll be excited to have so much data to work with

The Bad:

  • does not teach you much about Big Data
  • it may be too simplistic for data scientists

The Use:

  • useful for any manager analysing data or involved in the development of reports
  • useful for industrial engineers and business analysts who are involved in optimization

I have tried wading through a couple of highly recommended books on data science and data analysis before, but they were arduous, obscure, or both. Even the ones that claimed to do data science easily through Excel felt like swimming in peanut butter, i.e. progress was slow and exhausting. That is possibly because books on data science are generally either too technical (how to use R and Hadoop, or what the mathematics behind everything is) or too philosophical (what data science can deliver rather than how it can deliver it). But when I got hold of this book, the world of data science became much more accessible.

Foreman’s approach is to hold your hand while he takes you meticulously, step by step, through multiple analysis techniques in Excel, i.e. you learn by application. By using Excel he employs an application that most readers will be acquainted with, and thus he also strips away the obscurity of programming code. He makes use of appropriate humour, excellent narrative, and almost believable case studies to illustrate how each of the techniques could be applied. With complex methods he starts off with the core of the technique, keeping it as simple as possible, and then gradually increases the complexity (e.g. the chapter on linear programming). He wrote these chapters in such a way that when the complexity gets too much, you can exit the chapter and head to the next one without losing the plot.

The book deals primarily with optimization, clustering, prediction, and outlier-detection techniques. It does not say a great deal about using very large data sets (larger than what Excel can handle), but it does equip you to understand how you would tackle such data sets if you had tools able to deal with them. In the final chapter, however, he does discuss the migration from Excel to better-suited tools (R) for large data sets. Data scientists might find this book too simplistic, or lacking in content on some subjects, but managers wanting to enter the world of data science will revel in it.

This book is highly recommended if you want to know more about big data and data science than you can learn from hype-generating books.