How to monetize your data
Roman's Data Science
"How can you get much more out of your data without much effort?"
Buy on Amazon
Kindle - $11.99
Paperback - $24.99 (US)
Customer reviews (read)
Other countries: UK, DE, FR, ES, IT, JP, CA, AU
What is this book?
An introduction to the field of data analysis written in jargon-free language that is not bogged down by programming code and mathematical formulas. It covers the most essential topics in the fields of data science, machine learning, and business intelligence that you are likely to come across on a regular basis.

The main goal is to help readers get the most out of their data, make business decisions and create information products – all without paying over the odds.

In a career spanning over 20 years, the author has worked as a junior data analyst, headed up the analytics division of a $10-billion company, and co-founded a Recommender Systems startup. The text was edited by a professional journalist. It contains QR codes and links you can follow if you want a deeper understanding of the topics covered.
Why I decided to write the new Data Science book
Read the post here.
Who is this book for
Newcomers to data science will learn what the business needs in this area, different data analysis approaches, and an effective way to master machine learning.
Startup founders will learn how to get their data science divisions up and running. They will also be introduced to three approaches to analyzing A/B tests.
Managers will learn about task management, money efficiency, and the hypothesis pipeline in data science.
Who's the author
Roman Zykov
Founder/Data scientist @ TopDataLab | "Roman's Data Science" book author

  • Has twenty years of experience in data analysis and holds a master’s degree in applied mathematics and physics.
  • Founded Retail Rocket, the leading Russian provider of e-commerce recommendation systems (SaaS) with offices in Europe and South America.
  • Created analytics from scratch for a Russian online retailer (worth $10 billion, listed on Nasdaq).
Complete author page. Email
Twitter, LinkedIn, Medium
Table of contents
Chapter 1. How We Make Decisions
  • Four Hundred Relatively Honest Ways
  • What We Can Learn from Amazon
  • Analysis Paralysis
  • Mistakes and the Caliper Rule
  • The Pareto Principle
  • Can We Make Decisions Based on Data Alone?
Chapter 2. Let’s Do Some Data Analysis
  • Data Analysis Artifacts
  • Business Intelligence Artifacts
  • Insights and Hypotheses
  • Reports, Dashboards and Metrics
  • Machine Learning Artifacts
  • Data Engineering Artifacts
  • Who Analyses Data?
  • The Perfect Button
  • Sell Analytics Internally
  • The Conflict Between the Researcher and Business
  • The Weaknesses of a Statistical Approach in Data Analysis
Chapter 3. Building Analytics from Scratch
  • Step One
  • Choosing the Tech
  • Let’s Talk about Outsourcing
  • Hiring and Firing (or Resigning)
  • Who Do Analysts Answer To?
  • Should the Head of Analytics Write Code?
  • Task Management
  • How to Get the Best Out of Daydreamers
Chapter 4. How about Some Analytical Tasks?
  • How to Set Tasks for Data Scientists
  • How to Check Tasks
  • How to Test and Introduce Changes into a Working System
  • Justifying the Task for the Originator
  • Do You Need to Know How to Code?
  • Datasets
  • Descriptive Statistics
  • Diagrams
  • A General Approach to Data Visualization
  • Pair Data Analysis
  • Technical Debt
Chapter 5. Data
  • How We Collect Data
  • Big Data
  • Data Connectivity
  • There’s No Such Thing as Too Much Data
  • Data Access
  • Data Quality
  • Checking and Monitoring Data Quality
  • Data Types
  • File Storage Formats
  • Ways to Retrieve Data
Chapter 6. Data Warehouses
  • Why Do We Need Data Warehouses?
  • Data Warehouse Layers
  • What Kinds of Data Warehouse Exist?
  • How Data Makes It Into the Warehouse
  • Hadoop and MapReduce
  • Spark
  • Optimizing Work Speed
  • Data Archiving and Obsolescence
  • Monitoring Data Warehouses
  • My Personal Experience
Chapter 7. Data Analysis Tools
  • Spreadsheets
  • Notebooks
  • Visual Analysis Tools
  • Statistical Software
  • Working With Data in the Cloud
  • What Makes a Good Reporting System?
  • Pivot Tables
  • OLAP Cubes
  • Enterprise and Self-Service BI Systems
  • My Experience
Chapter 8. Machine Learning Algorithms
  • Types of ML Problems
  • ML Task Metrics
  • ML from the Inside
  • Linear Regression
  • Logistic Regression
  • Decision Trees
  • Learning Errors
  • What to Do about Overfitting
  • Ensemble Methods
Chapter 9. The Practice of Machine Learning
  • How to Learn Machine Learning
  • ML Competitions
  • Artificial Intelligence
  • Required Data Transformations
  • The Accuracy and Cost of ML Solutions
  • The Simplicity of the Solution
  • The Amount of Work Involved in Checking the Result
  • Mechanical Turk / Yandex Toloka
  • ML and Big Data
  • Recency, Frequency and Monetary
  • Conclusion
Chapter 10. Implementing ML in Real Life: Hypotheses and Experiments
  • Hypotheses
  • Hypothesis Testing: Planning
  • What Is a Hypothesis in Statistics?
  • The Statistical Significance of Hypotheses
  • Statistical Criteria for P-Values
  • Bootstrapping
  • Bayesian Statistics
  • A/B Tests in the Real World
  • A/A Tests
  • A Few More Words about A/B Tests
  • Setting up an A/B Test
  • Experiment Pipeline
Chapter 11. Data Ethics
  • How We Are Being Watched
  • Good and Bad Data Usage
  • The Problem of Data Leakage
  • Data Ethics
  • How User Data is Protected
Chapter 12. Challenges and Startups
  • Web Analytics in Advertising
  • Internal Web Analytics
  • Database Marketing
  • Startups
  • My Personal Experience
Chapter 13. Building a Career
  • Starting Your Career
  • How to Find a Job
  • Requirements for Candidates
  • You’ve Accepted an Offer
  • Continuing Professional Development
  • How to Change Jobs
  • Do You Need to Know Everything?
Data is everywhere, from Tinder algorithms that match you with supposedly (but not really) random people, to information wars waged by politicians. It comes as no surprise to anyone these days that every single thing we do is closely monitored, including our internet search history and whatever we might be up to offline too. Did something catch your eye when you were passing that sports store? Just wait for the ads to start appearing on your social network pages. Tell a friend at work what your cat’s been up to and suddenly there’s dry kibble and cat litter all over your feed.

This is where the more impressionable of us might become more than just a little paranoid. But it’s not the data that’s to blame. It’s all about whose hands it falls into. There are many myths when it comes to data analysis, and “data scientist” is one of the “sexiest” and most promising professions of the future. My aim with this book is to debunk these myths and tell things how they really are. And I hope that you, the reader, will find yourself on the “light side” of the Force alongside me.

I graduated from the Moscow Institute of Physics and Technology in the early noughties before going on to head up the analytical department of an online store, where I created analytical systems from scratch. I have provided consulting services to investment funds and retail and game industry giants. Eight years ago, I co-founded Retail Rocket, a marketing platform for online stores. Since then, we have become the undisputed market leader in Russia and have expanded our operations to Chile, the Netherlands, Spain and Germany. In 2016, I gave a guest lecture on hypothesis testing at MIT in Boston, and in 2020, I was nominated for the CDO Award.

They say it takes 10,000 hours of practice to become a master in one’s field. I’ve been doing data analysis since 2002, when it wasn’t such a hot and talked-about profession. Have I clocked in those all-important 10,000 hours? Well, let’s do the maths: 10,000 hours / 4 hours a day / 200 days a year = 12.5 years. Looks like I’ve actually put in one and a half times this figure! I hope this is enough to have produced a book that the reader will find useful.

This book is about how to turn data into products and solutions. It is not based on academic knowledge, but on my personal experience of data analysis over the past 20 years or so. There is no shortage of courses on data science and machine learning these days, but, as a rule, they are highly specialized. This book is different in that it does not bog the reader down with unnecessary details. Rather, it provides a big picture perspective, offering insight into:
  • data-driven decision making
  • how systems should work
  • how to test your service
  • how to combine everything into a single whole in order to create a “conveyor belt” for your data output.
Look inside
A preview of the book is available here.
Blog about the book
Author's blog based on the book
Book reviews
More reviews on Amazon and Goodreads
Ajit Jaokar
Course Director: Artificial Intelligence: Cloud and Edge Implementations - University of Oxford
It's a 300-plus-page collection of detailed and practical personal insights from the author on the real-life implementation of large AI projects.
What you can only know by having implemented serious AI projects in real life.
Many practical things on balancing research and development, managing multiple hypotheses in the agile framework etc - things which you would normally not get to know.
Christopher Watitwa
Product Owner, General Insurance System at Turnkey Africa Ltd
This book is filled with nuggets of wisdom for students, practitioners, and enthusiasts of Data Science. The author gives us a sneak peek into the experience acquired over 20+ years in the industry, laced with lessons learnt on what works and what doesn't. He doesn't get weary of reminding us that the essence of Data Science is to add value to the business.
Nick Davidov
Partner in Davidovs VC
This book is an amazing introduction to data science for everyone that actually leads you quite deep without getting unnecessarily sophisticated. I think business owners and operators can benefit the most and I highly recommend this read for anyone building a startup.
Aigerim Shopenova
Data Scientist @ Rakuten
I was curious to know how businesses earn money from data solutions and how to develop a product that is valuable for users and solves their problems. Along with that, I got a big picture of what is going on in the data science field and got some tips on how to navigate my career.