Download e-books at MyEbooks.ch

Machine Learning Using R

Authors:	Karthik Ramasubramanian, Abhishek Singh
Title:	Machine Learning Using R
Publisher:	Apress
ISBN:	9781484223345
Edition:	1
Price:	CHF 38.00

Category:	Computer Science
Language:	English

Pages:	580
Copy protection:	Watermark/DRM
Devices:	PC/MAC/eReader/Tablet
Format:	PDF

This book is inspired by the Machine Learning Model Building Process Flow, which gives the reader the ability to understand an ML algorithm and apply the entire process of building an ML model from raw data.

This new paradigm of teaching Machine Learning will bring about a radical change in perception for many of those who think this subject is difficult to learn. Though theory sometimes looks difficult, especially when there is heavy mathematics involved, the seamless flow from the theoretical aspects to example-driven learning provided in this book makes it easy for someone to connect the dots.

For every Machine Learning algorithm covered in this book, a 3-D approach of theory, case-study and practice will be given. And where appropriate, the mathematics will be explained through visualization in R.

All practical demonstrations will be explored in R, a powerful programming language and software environment for statistical computing and graphics. The various packages and methods available in R will be used to explain the topics. In the end, readers will learn some of the latest technological advancements in building a scalable machine learning model with Big Data.

Who This Book is For:

What you will learn:

1. ML model building process flow

2. Theoretical aspects of Machine Learning

3. Industry-based case study

4. Example-based understanding of ML algorithms using R

5. Building ML models using Apache Hadoop and Spark

Karthik Ramasubramanian works for one of the largest and fastest-growing technology unicorns in India, Hike Messenger. He brings the best of Business Analytics and Data Science experience to his role at Hike Messenger. In his 7 years of research and industry experience, he has worked on cross-industry data science problems in retail, e-commerce, and technology, developing and prototyping data-driven solutions. In his previous role at Snapdeal, one of the largest e-commerce retailers in India, he led core statistical modelling initiatives for customer growth and pricing analytics. Prior to Snapdeal, he was part of the central database team, managing the data warehouses for global business applications of Reckitt Benckiser (RB). He has rich experience working with scalable machine learning solutions for industry, including sophisticated graph networks and self-learning neural networks. He has a Master's in Theoretical Computer Science from PSG College of Technology, Anna University, and is a certified big data professional. He is passionate about teaching and mentoring future data scientists through different online and public forums. He enjoys writing poems in his leisure time and is an avid traveler.

Abhishek Singh is a Data Scientist in the Advanced Data Science team of Prudential Financial Inc., the second-largest life insurance provider in the US, and is based in Ireland. He has 5 years of professional and academic experience in Data Science, spanning consulting, teaching, and financial services. At Deloitte Advisory, he led Risk Analytics initiatives for top US banks in their regulatory risk, credit risk, and balance sheet modelling requirements. In his current role, he is working on scalable machine learning algorithms for the Individual Life Insurance business of Prudential. He has working experience in time series models and has worked with cross-functional teams to implement data science solutions in enterprise infrastructure. He has been an active trainer at Deloitte Professional University and has led training and development initiatives for professionals in the areas of statistics, economics, financial risk, and data science tools (SAS and R). He holds a B.Tech. in Mathematics and Computing from the Indian Institute of Technology, Guwahati, and an MBA from the Indian Institute of Management, Bangalore. He speaks at public events on Data Science and is working with leading universities towards bringing data science skills to graduates. He has a keen interest in Law and holds a Post Graduate Diploma in Cyber Law from NALSAR University. He enjoys cooking and photography in his free time.

	Contents at a Glance
	Contents
	About the Authors
	About the Technical Reviewer
	Acknowledgments
	Chapter 1: Introduction to Machine Learning and R
	1.1 Understanding the Evolution
	1.1.1 Statistical Learning
	1.1.2 Machine Learning (ML)
	1.1.3 Artificial Intelligence (AI)
	1.1.4 Data Mining
	1.1.5 Data Science
	1.2 Probability and Statistics
	1.2.1 Counting and Probability Definition
	1.2.2 Events and Relationships
	1.2.2.1 Independent Events
	1.2.2.2 Conditional Independence
	1.2.2.3 Bayes Theorem
	1.2.3 Randomness, Probability, and Distributions
	1.2.4 Confidence Interval and Hypothesis Testing
	1.2.4.1 Confidence Interval
	1.2.4.2 Hypothesis Testing
	1.3 Getting Started with R
	1.3.1 Basic Building Blocks
	1.3.1.1 Calculations
	1.3.1.2 Statistics with R
	1.3.1.3 Packages
	1.3.2 Data Structures in R
	1.3.2.1 Vectors
	1.3.2.2 List
	1.3.2.3 Matrix
	1.3.2.4 Data Frame
	1.3.3 Subsetting
	1.3.3.1 Vectors
	1.3.3.2 Lists
	1.3.3.3 Matrixes
	1.3.3.4 Data Frames
	1.3.4 Functions and Apply Family
	1.4 Machine Learning Process Flow
	1.4.1 Plan
	1.4.2 Explore
	1.4.3 Build
	1.4.4 Evaluate
	1.5 Other Technologies
	1.6 Summary
	1.7 References
	Chapter 2: Data Preparation and Exploration
	2.1 Planning the Gathering of Data
	2.1.1 Variables Types
	2.1.1.1 Categorical Variables
	2.1.1.2 Continuous Variables
	2.1.2 Data Formats
	2.1.2.1 Comma-Separated Values
	2.1.2.2 Microsoft Excel
	2.1.2.3 Extensible Markup Language: XML
	2.1.2.4 Hypertext Markup Language: HTML
	2.1.2.5 JSON
	2.1.2.6 Other Formats
	2.1.3 Data Sources
	2.1.3.1 Structured
	2.1.3.2 Semi-Structured
	2.1.3.3 Unstructured
	2.2 Initial Data Analysis (IDA)
	2.2.1 Discerning a First Look
	2.2.1.1 Function str()
	2.2.1.2 Naming Convention: make.names()
	2.2.1.3 Table(): Pattern or Trend
	2.2.2 Organizing Multiple Sources of Data into One
	2.2.2.1 Merge and dplyr Joins
	2.2.2.1.1 Using merge
	2.2.2.1.2 dplyr
	2.2.3 Cleaning the Data
	2.2.3.1 Correcting Factor Variables
	2.2.3.2 Dealing with NAs
	2.2.3.3 Dealing with Dates and Times
	2.2.3.3.1 Time Zone
	2.2.3.3.2 Daylight Savings Time
	2.2.4 Supplementing with More Information