Bibliography

Most of the references in this book (Roman's Data Science: How to Monetize Your Data) are provided as hyperlinks. Over time, some of them will stop working. To keep every reference accessible, I have set up a mechanism available at http://topdatalab.com/ref?link=[Reference number], where “Reference number” is the number of the respective reference in the text (for example, for number 23: https://topdatalab.com/ref?link=23). If I learn that a link or QR code in this book has stopped working, I will restore it as soon as possible. All the reader has to do is let me know.

1. Behave: The Biology of Humans at Our Best and Worst. Robert Sapolsky https://www.google.com/search?q=Behave The Biology of Humans at Our Best and Worst Robert Sapolsky
2. Amazon.com: Letter to shareholders 2015 https://s3-us-west-2.amazonaws.com/amazon.job-cms-website.paperclip.prod/shareholder_letters/2015.pdf
3. Amazon.com: Letter to shareholders 2016 https://blog.aboutamazon.com/company-news/2016-letter-to-shareholders
4. What is decision intelligence https://towardsdatascience.com/introduction-to-decision-intelligence-5d147ddab767
5. Focus on decisions not outcomes https://towardsdatascience.com/focus-on-decisions-not-outcomes-bf6e99cf5e4f
6. Russian Covid deaths three times the official toll https://www.bbc.com/news/world-europe-55474028
7. Understanding Decision Fatigue https://www.healthline.com/health/decision-fatigue
9. Building Data Science Teams. DJ Patil https://www.dropbox.com/s/9scdtqmi8k2lb5y/Building%20Data%20Science%20Teams.pdf?dl=0
10. What’s the difference between analytics and statistics? https://towardsdatascience.com/whats-the-difference-between-analytics-and-statistics-cd35d457e17
11. Debunking Narrative Fallacies with Empirically-Justified Explanations https://multithreaded.stitchfix.com/blog/2016/03/23/debunking-narrative-fallacies/
12. A/B test attack: recipe 'R'+t(101)+'es46' https://translate.google.com/translate?hl=en&sl=ru&tl=en&u=https://habr.com/ru/company/retailrocket/blog/330012/
13. Measure What Matters: How Google, Bono, and the Gates Foundation Rock the World with OKRs. John Doerr https://www.google.com/search?q=Измеряйте самое важное. Как Google, Intel и другие компании добиваются роста с помощью OKR | Дорр Джон
14. Dogs vs. Cats: Create an algorithm to distinguish dogs from cats https://www.kaggle.com/c/dogs-vs-cats
15. ResNet-50 is a convolutional neural network https://github.com/matlab-deep-learning/resnet-50
16. Data scientists mostly just do arithmetic and that’s a good thing https://m.signalvnoise.com/data-scientists-mostly-just-do-arithmetic-and-thats-a-good-thing/
17. BBC interview with Carl Gustav Jung, founder of analytical psychology, 1955 https://translate.google.com/translate?hl=en&sl=ru&tl=en&u=https://www.bbc.com/russian/features-53475033
18. The Tyranny of Metrics. Jerry Muller https://www.google.com/search?q=The Tyranny of Metrics. Jerry Muller
19. Spark/Scala Young Fighter Course https://translate.google.com/translate?hl=en&sl=ru&tl=en&u=https://habr.com/ru/company/retailrocket/blog/302828/
20. Data science management https://www.quora.com/How-do-I-move-from-data-scientist-to-data-science-management
21. You and Your Research. Richard Hamming https://www.cs.virginia.edu/~robins/YouAndYourResearch.html
22. Planning Poker https://en.wikipedia.org/wiki/Planning_poker
23. Hypothesis Testing: How to Eliminate Ideas as Soon as Possible. Roman Zykov https://recsys.acm.org/recsys16/industry-session-3/#content-tab-1-1-tab
24. Application of Kullback-Leibler divergence for short-term user interest detection https://arxiv.org/abs/1507.07382
25. Does a Store Need Stylistic Cross-Sell? Retail Rocket's Experience in Image Analysis for Generating Recommendations https://translate.google.com/translate?hl=en&sl=ru&tl=en&u=https://habr.com/ru/company/retailrocket/blog/441366/
26. The most powerful idea in data science https://towardsdatascience.com/the-most-powerful-idea-in-data-science-78b9cd451e72
27. Elementary Concepts in Statistics https://docs.tibco.com/data-science/GUID-6C466605-AB68-4F81-B2BA-220BEAA05D51.html
28. Say It With Charts. Gene Zelazny https://www.google.com/search?q=Say It With Charts. Jene Zelazny
29. The Cognitive Style of PowerPoint: Pitching Out Corrupts Within. Edward R. Tufte https://www.google.com/search?q=The Cognitive Style of Powerpoint: pitching out corrupts within. Edward R. Tufte
30. On Pair Programming. Martin Fowler https://martinfowler.com/articles/on-pair-programming.html
31. Technical Debt. Martin Fowler https://martinfowler.com/bliki/TechnicalDebt.html
32. Netflix Culture https://jobs.netflix.com/culture
33. Retailrocket recommender system dataset https://www.kaggle.com/retailrocket/ecommerce-dataset
34. Making Sense of Data Warehouse Architecture https://datawarehouseinfo.com/data-warehouse-architecture/
35. Columnar database: a smart choice for data warehouses https://www.stitchdata.com/columnardatabase/
36. System and method for efficient large-scale data processing (Google) http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/PTO/srchnum.htm&r=1&f=G&l=50&s1=7,650,331.PN.&OS=PN/7,650,331&RS=PN/7,650,331
37. MapReduce: Simplified Data Processing on Large Clusters https://www.dropbox.com/s/azf00wnjwnqd2x8/mapreduce-osdi04.pdf?dl=0
38. The Friendship That Made Google Huge https://www.newyorker.com/magazine/2018/12/10/the-friendship-that-made-google-huge
39. Apache Hadoop https://hadoop.apache.org/
40. Apache Spark http://spark.apache.org/
41. Loader of HDFS files with combining small files on Spark https://github.com/RetailRocket/SparkMultiTool
42. Python for Data Analysis. Wes McKinney https://www.google.com/search?q=Python for Data Analysis. Wes McKinney
43. Cloudera Hadoop - Choosing and Configuring Data Compression https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_data_compression_performance.html
44. Google Colab https://colab.research.google.com/
45. Kaggle notebooks https://www.kaggle.com/notebooks
46. Gartner Top 10 Trends in Data and Analytics for 2020 https://www.gartner.com/smarterwithgartner/gartner-top-10-trends-in-data-and-analytics-for-2020/
47. Metabase https://www.metabase.com/
48. Apache Superset https://superset.apache.org/
49. Beyond Interactive: Notebook Innovation at Netflix https://netflixtechblog.com/notebook-innovation-591ee3221233
50. What Artificial Intelligence Can and Can’t Do Right Now https://hbr.org/2016/11/what-artificial-intelligence-can-and-cant-do-right-now
51. Regression Towards Mediocrity in Hereditary Stature. Francis Galton http://www.stat.ucla.edu/~nchristo/statistics100C/history_regression.pdf
52. Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches https://arxiv.org/abs/1907.06902
53. Kaggle’s State of Data Science and Machine Learning 2019 https://www.kaggle.com/kaggle-survey-2019
54. Unity is strength — A story of model composition https://medium.com/criteo-labs/unity-is-strength-a-story-of-model-composition-49748b1f1347
55. Introduction to Machine Learning. Second Edition. Ethem Alpaydin. https://www.google.com/search?q=Introduction to Machine Learning. Second Edition. Ethem Alpaydin.
56. scikit-learn: Ensemble methods https://scikit-learn.org/stable/modules/ensemble.html
57. XGBoost: Introduction to Boosted Trees https://xgboost.readthedocs.io/en/latest/tutorials/model.html
58. LightGBM https://lightgbm.readthedocs.io/
59. CatBoost https://catboost.ai/
60. Andrew Ng. Machine Learning Yearning https://www.deeplearning.ai/machine-learning-yearning/
61. Coursera Machine Learning https://www.coursera.org/learn/machine-learning
62. How do I learn machine learning? https://qr.ae/pN9vA4
63. Fastml4j on Scala https://github.com/rzykov/fastml4j
64. Netflix prize https://www.netflixprize.com
65. Netflix Recommendations: Beyond the 5 stars (Part 1) https://netflixtechblog.com/netflix-recommendations-beyond-the-5-stars-part-1-55838468f429
66. Andrew Gelman, Jennifer Hill. “Data Analysis Using Regression and Multilevel/Hierarchical Models” https://www.dropbox.com/s/a82wwn6l74j5qka/Gelman-missing.pdf?dl=0
67. Google Course of ML: Imbalanced Data https://developers.google.com/machine-learning/data-prep/construct/sampling-splitting/imbalanced-data
68. 10 More lessons learned from building real-life Machine Learning systems https://xamat.medium.com/10-more-lessons-learned-from-building-real-life-ml-systems-part-i-b309cafc7b5e
69. ScaleFactor Raised $100 Million In A Year Then Blamed Covid-19 For Its Demise. Employees Say It Had Much Bigger Problems. https://www.forbes.com/sites/davidjeans/2020/07/20/scalefactor-raised-100-million-in-a-year-then-blamed-covid-19-for-its-demise-employees-say-it-had-much-bigger-problems/
71. DRILLING DOWN: Turning Customer Data into Profits with a Spreadsheet - Third Edition, Jim Novo https://www.google.com/search?q=DRILLING DOWN: Turning Customer Data into Profits with a Spreadsheet - Third Edition, Jim Novo
72. Google Rules of Machine Learning: Best Practices for ML Engineering https://developers.google.com/machine-learning/guides/rules-of-ml
73. Louse laser pioneer in contention for invention award https://thefishsite.com/articles/louse-laser-pioneer-in-contention-for-invention-award
74. Lox prices in city eateries could jump due to salt-water parasite https://nypost.com/2017/01/15/lox-prices-in-city-eateries-could-jump-due-to-salt-water-parasite/
75. Adidas backpedals on robotic shoe production with Speedfactory closures https://techcrunch.com/2019/11/11/adidas-backpedals-on-robotic-factories/
76. Ronald Aylmer Fisher biography https://www.adelaide.edu.au/library/special/mss/fisher/fisherbiog.pdf
77. Larry Wasserman, All of Statistics: A Concise Course in Statistical Inference (Springer Texts in Statistics), Springer (December 1, 2010) https://www.google.com/search?q=Larry Wasserman, All of Statistics: A Concise Course in Statistical Inference (Springer Texts in Statistics), Springer (December 1, 2010)
78. Nonparametric Statistics Introductory Overview - When to Use Which Method https://docs.tibco.com/data-science/GUID-1669B816-C669-4F4F-919E-231A8F3CAFDA.html
79. B. Efron, Bootstrap Methods: Another Look at the Jackknife https://doi.org/10.1214/aos/1176344552
80. Bootstrap confidence intervals https://www.dropbox.com/s/6dbqxrcocmfxyvp/MIT18_05S14_Reading24.pdf?dl=0
81. Criteo Labs: Why your A/B-test needs confidence intervals https://medium.com/criteo-labs/why-your-ab-test-needs-confidence-intervals-bec9fe18db41
82. Bayesian A/B tests https://richrelevance.com/2013/05/21/bayesian-ab-tests/
83. William Bolstad, Introduction to Bayesian Statistics https://www.google.com/search?q=William Bolstard, Introduction to Bayesian Statistics
84. Ron Kohavi, Alex Deng, Roger Longbotham, and Ya Xu. Seven Rules of Thumb for Web Site Experimenters https://exp-platform.com/rules-of-thumb/
85. Retail Rocket Segmentator https://github.com/RetailRocket/RetailRocket.Segmentator
86. Reinforcement Learning: An Introduction https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf
87. The Privacy Project https://www.nytimes.com/interactive/2019/opinion/internet-privacy-project.html
88. One Nation, Tracked https://www.nytimes.com/interactive/2019/12/19/opinion/location-tracking-cell-phone.html
89. Google Authorized Buyers, Real-time Bidding https://developers.google.com/authorized-buyers/rtb/start
90. Explained: Data in the Criteo Engine https://www.criteo.com/blog/explained-data-in-the-criteo-engine/
91. We Built an ‘Unbelievable’ (but Legal) Facial Recognition Machine https://www.nytimes.com/interactive/2019/04/16/opinion/facial-recognition-new-york-city.html
92. What ISPs Can See, Upturn, March 2016 https://www.upturn.org/reports/2016/what-isps-can-see/
93. The GDPR Is a Cookie Monster https://content-na1.emarketer.com/the-gdpr-is-a-cookie-monster
94. IAB. Cookies on Mobile 101 https://www.iab.com/wp-content/uploads/2015/07/CookiesOnMobile101Final.pdf
95. How Online Shopping Makes Suckers of Us All https://www.theatlantic.com/magazine/archive/2017/05/how-online-shopping-makes-suckers-of-us-all/521448/
96. Why are the largest Russian Internet sites removing the Liveinternet counter? https://translate.google.com/translate?hl=en&sl=ru&tl=en&u=https://vc.ru/flood/1822-pochemu-krupneyshie-saytyi-runeta-ubirayut-schetchik-liveinternet
97. How To Break Anonymity of the Netflix Prize Dataset https://arxiv.org/abs/cs/0610105
98. Alexa, are you invading my privacy? – the dark side of our voice assistants https://www.theguardian.com/technology/2019/oct/09/alexa-are-you-invading-my-privacy-the-dark-side-of-our-voice-assistants
99. LeakyPick: IoT Audio Spy Detector https://arxiv.org/abs/2007.00500
100. I SEARCH, THEREFORE I AM, Andreas Weigend https://www.dropbox.com/s/xk6w60szuq6dpeh/WeigendFOCUS2004-en.pdf?dl=0
101. We Read 150 Privacy Policies. They Were an Incomprehensible Disaster https://www.nytimes.com/interactive/2019/06/12/opinion/facebook-google-privacy-policies.html
102. 5 Americans who used NSA facilities to spy on lovers https://www.washingtonpost.com/news/the-switch/wp/2013/09/27/5-americans-who-used-nsa-facilities-to-spy-on-lovers/
103. Pie & AI Asia: On Ethical AI with Andrew Ng https://www.deeplearning.ai/blog/pie-ai-asia-on-ethical-ai-with-andrew-ng/
104. What Do We Do About the Biases in AI? https://hbr.org/2019/10/what-do-we-do-about-the-biases-in-ai
105. Ad Blocking Growth Is Slowing Down, but Not Going Away https://www.emarketer.com/content/ad-blocking-growth-is-slowing-down-but-not-going-away
106. IAB Europe Guide to the Post Third-Party Cookie Era https://iabeurope.eu/knowledge-hub/iab-europe-guide-to-the-post-third-party-cookie-era/
107. Comparing privacy laws: GDPR v. Russian Law on Personal Data https://www.dataguidance.com/sites/default/files/gdpr_v_russia_december_2019.pdf
108. This Article Is Spying on You https://www.nytimes.com/2019/09/18/opinion/data-privacy-tracking.html
109. Functionalism: A New Approach to Web Analytics https://www.dropbox.com/s/a75hmjzekf006ia/wpaper_005.pdf?dl=0
110. Strategic Database Marketing. Arthur Hughes https://www.google.com/search?q=Strategic Database Marketing. Arthur Hughes
111. A good founder’s guide to bad VC behaviour https://technation.io/news/good-founders-guide-bad-vc-behaviour/
112. Two Decades of Recommender Systems at Amazon.com https://www.amazon.science/publications/two-decades-of-recommender-systems-at-amazon-com
113. Item-to-Item Collaborative Filtering, Greg Linden, Brent Smith, and Jeremy York https://www.dropbox.com/s/dctxbv8dk8wrsmw/Amazon-Recommendations.pdf?dl=0
114. How to use Merchandising eVars in Adobe Analytics https://dmpg.co.uk/how-to-use-merchandising-evars-in-adobe-analytics-product-modules
116. Retail Rocket https://www.crunchbase.com/organization/retail-rocket