Python Machine Learning

Sebastian Raschka

Published by

Packt Publishing

ISBN

978-1783555130

RRP

£28.99

Reviewed by

Patrick Hill CEng MBCS CITP

Score

10 out of 10

There is significant interest, across many diverse application domains, in developing applications that exploit the large and rich data sets that are acquired by computer systems in order to identify trends and correlations and to make predictions and recommendations. These kinds of applications broadly fit into the field of machine learning (ML).

This excellent book is a practical introduction to ML using the Python programming language, along with relevant components of Python’s rich open-source ecosystem of machine learning, numerical computing, graph plotting and other cognate libraries.

The author does not assume any prior ML knowledge, though a basic understanding of the Python language, linear algebra and calculus is useful. The opening chapter introduces basic machine learning terminology, describes the various types of machine learning and gives an overview of the key steps in a typical machine learning process. It also discusses, in general terms, how to install and configure Python and the various libraries required by the remainder of the book.

As may be expected, a significant part of the book is given to the discussion of a variety of different learning algorithms, including the mathematical details of how they work, how they are trained and how they perform against other algorithms. Crucially, the book describes how to use library implementations of each algorithm.

While ML algorithms play a central part in any machine learning application, a variety of other issues are involved in making these algorithms work effectively in a particular application. The book therefore contains a comprehensive discussion of key topics such as data exploration, visualization and preparation; techniques for generating test and validation data sets; and algorithm training and evaluation.
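To give a flavour of the workflow these topics describe, the following is a minimal sketch of holding out a test set, training a model and evaluating it. It is not taken from the book: the data, the function names and the choice of a nearest-centroid classifier are all assumptions made for illustration, and it uses only the Python standard library so that it runs without installing anything. In practice, the library implementations the book relies on would handle each of these steps.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """Training step: compute the mean feature value for each class label."""
    sums, counts = {}, {}
    for x, label in train:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def accuracy(centroids, rows):
    """Evaluation step: fraction of rows classified correctly."""
    correct = sum(predict(centroids, x) == label for x, label in rows)
    return correct / len(rows)


# Hypothetical toy data: one numeric feature, two well-separated classes.
data = [(float(i), "low") for i in range(20)]
data += [(float(i), "high") for i in range(100, 120)]

train, test = train_test_split(data, test_fraction=0.25, seed=0)
model = fit_centroids(train)           # the model only ever sees the training rows
test_accuracy = accuracy(model, test)  # performance is measured on unseen rows
print(f"train={len(train)} test={len(test)} accuracy={test_accuracy:.2f}")
```

The point of the split is that the model is scored only on rows it never saw during training, which gives a fairer estimate of how it would behave on new data.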

The author describes appropriate tools and techniques that can assist with these processes. To help put ML in context, a chapter is also devoted to describing a simple but complete online ML system embedded inside a web application.

The book covers a considerable amount of material over its four hundred pages. Rather than offering a dry exposition of ML techniques, the author encourages readers to engage with the text by experimenting with the examples. Naturally, good data sets are required, and the book provides links to a variety of real-world data sets that are freely available for download. In addition, the source code for the examples may be downloaded from the book’s website.

Open-source libraries are used extensively, providing efficient implementations and enabling the reader to focus on the machine learning application itself rather than getting bogged down in implementation detail. The text contains plenty of practical advice as well as inline links to other sources of information. While this book is a good end-to-end read, it would also serve as a useful reference text. Recommended.

Further information: Packt Publishing

January 2016