€100

BI and Analytics with SQL Server and R

Event Information

Location

Microsoft Auditorium, EPDC2

South County Business Park

Leopardstown

Dublin (Luas Stop: Central Park), Dublin 18

Description

Abstract:

This workshop introduces the R language and R programming for data scientists, for the purposes of data wrangling, data preparation, statistical analysis and predictive analytics. Covering the basics of statistics and multivariate statistics, along with the most common statistical approaches, provides a solid grounding for advanced algorithms and predictive analytics. The workshop also covers guidelines for working with larger datasets and for working in conjunction with a relational database (SQL Server) for distributed and parallel high-performance computation.


Objectives:
Get to know the R language and learn how to use R for data analysis and predictive analytics in the typical daily encounters and challenges of data science. Explore a variety of algorithms and methods for statistical analysis and predictive analytics, and walk away knowing how, where and when to use them. Explore how to tackle datasets from different sources: SQL Server, text files, the web and social media. You will also see how to tackle performance issues when dealing with large datasets and how to get good performance and scalability on your on-premises server.


Content and Topics covered in the workshop:
Module 1: 9.00 – 10.45 Introducing the basics of the R language and R programming. We will cover the basics of the R language and its environment, from data types to programming in R, as well as data wrangling, munging, scraping and data understanding.

Break 1: 10.45 – 11.00

Module 2: 11.00 – 12.30 Understanding basic and multivariate statistics for data scientists. This module focuses on basic data understanding and on bivariate and multivariate statistics, covering correlations, regression, analysis of variance, factor analysis, and canonical and discriminant analysis.

Lunch: 12.30 – 14.00

Module 3: 14.00 – 15.30 Predictive algorithms explained, and working with datasets for predictions in the R language. This module covers commonly used predictive algorithms that every data scientist can use in their daily work: decision trees, gradient boosting with regression, naïve Bayes, clustering and time series.

Break 2: 15.30 – 15.45

Module 4: 15.45 – 17.00 Exploring the RevoScaleR R package for parallel and distributed computation on large datasets within SQL Server 2016. SQL Server 2016 comes with the RevoScaleR package, which addresses several limitations of working with larger datasets, especially in corporate environments. Several typical R limitations are addressed and removed, making predictive analysis against large datasets and databases highly efficient. This module provides insight into how to use the package with large datasets.

Break 3: 17.00 – 17.15

Module 5: 17.15 – 18.00 Storytelling. Cases from the field: using CRM data and predictions within SQL Server, basket analysis, customer classification, performance analysis for DBAs and system administrators, and building a recommender system.

Target audience:

This workshop is for anyone who would like to refresh or deepen their knowledge of data science, the R language, and working with data for reports or data visualizations. We welcome everyone who works with data (data wranglers, data analysts), with reports and marketing campaigns (business and marketing people), or on data analysis (data analysts, data scientists). If your daily work revolves around data and you are curious to learn more about the R language, statistics or predictive analytics, join us.

Required/Suggested Materials and Software:

You are very welcome to bring your own laptop and work along with the trainer.

Have the R engine (link: www.project-r.org/download) and RStudio (link: www.rstudio.com/download) preinstalled. Optionally, for Module 4, have SQL Server 2016 with R installed (Enterprise or Developer edition), or a later version (2017).

Accompanying materials, code and PPT slides will be handed out at the workshop.


About the Trainer

Tomaž Kaštrun is a SQL Server developer, BI and data analyst. He has more than 15 years of experience in business warehousing, development, ETL, database administration and query tuning. He also has more than 15 years of experience in the fields of data analysis, data mining, statistical research, and machine learning.

He is a Microsoft SQL Server data platform MVP and has been working with Microsoft SQL Server since version 2000.

Links
Twitter: https://twitter.com/tomaz_tsql
Technical Blog: https://tomaztsql.wordpress.com/
Public email: tomaz.kastrun@gmail.com
