CSU2017106 Big Data Analysis - tools and methods
Big Data is omnipresent, from industry to government, and is frequently considered a completely new approach to problem solving. While the possibilities are often exaggerated, Big Data does indeed introduce new opportunities and challenges. The ability to analyse and combine large datasets from different sources has obvious applications; nonetheless, the lack of quality in the data, combined with high variance, means that conventional analysis often fails.
This course will bring you to the forefront of the newest tools and methods, based on cutting-edge research and experience.
What you will learn
By completing the course you will be able to set up a basic Big
Data Analysis end-to-end: from retrieving and cleaning the data,
through establishing the information level, extracting patterns and
finding outliers, to curating the necessary data.
You will get acquainted with a number of advanced tools, such as data
cleaning, statistical methods for very large datasets, data stream
analysis, finding patterns and outliers in Big Data, collecting
data from instruments and devices (e.g. the Internet of Things) and
hardware systems design for efficient BDA.
Course Content
We will use a few structured datasets consistently throughout the course; they illustrate commercial applications and will be used to demonstrate the different steps in Big Data Analysis.
Core elements:
- Data cleaning: Detecting and correcting (or removing) corrupt or inaccurate records
- Statistical methods: Robust methods for very large datasets and data with very large variance and outliers
- Finding patterns and outliers in Big Data: Which methods can be used to identify sparse patterns in very large datasets, and how to identify data that does not follow the overall pattern for a dataset?
- Collecting data from instruments and devices: How to collect, store, and analyse data from a multitude of sources that produce data (e.g. the Internet of Things)
- Systems for Big Data Analysis: Common systems for BDA (Hadoop,
PyDisco, etc.) and hardware systems design for efficient BDA.
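As a small illustration of the robust-statistics and outlier-finding elements listed above, the sketch below flags outliers using the median and the median absolute deviation (MAD) instead of the mean and standard deviation. It is not taken from the course material: the function name `mad_outliers`, the sample `readings`, the threshold of 3.5 and the scaling constant 0.6745 (a common convention for the modified z-score) are all assumptions chosen for illustration.

```python
import statistics


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    Hypothetical helper for illustration; threshold=3.5 is a commonly
    used default, not a value prescribed by the course.
    """
    med = statistics.median(values)
    # Median absolute deviation: a spread estimate that a few extreme
    # values barely affect, unlike the standard deviation.
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # for normally distributed data.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Made-up sensor readings with one obviously corrupt record.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 55.0]
print(mad_outliers(readings))
```

With these made-up readings only the value 55.0 is flagged. For datasets too large to hold in memory, the exact median would typically be replaced by a streaming or sampled estimate.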
Tools/methods introduced:
- Selected machine learning algorithms for large-scale data.
- Random forests and large-scale exact nearest neighbour search.
- Data curation: How to select data for long-term curation, and the
systems, techniques and standards for data curation.
We will be working with several programming tools; however, all
techniques covered are easily implemented in any standard
data-analysis language (Python, R, etc.).
Participants
The course is strictly focused on Big Data Analysis; a background in statistics and/or conventional data analysis is therefore assumed. Participants are expected to have at least a Bachelor-level education and/or several years of data analysis experience.
Course dates
5 days, 14 – 18 August 2017, 9:00 – 16:30 at the University of Copenhagen, Frederiksberg Campus.
Course director
Troels C. Petersen, Associate Professor, Particle Physics, Niels Bohr Institute, University of Copenhagen
Other course teachers
Brian Vinter, Professor, eScience, Niels Bohr Institute, University of Copenhagen
Joachim Mathiesen, Associate Professor, Biocomplexity, Niels Bohr Institute, University of Copenhagen
Course fee
EUR 2,600/DKK 19,000 excl. Danish VAT. Fee includes teaching, course materials and all meals during the course.
- Category: Hours
- Class instruction: 40
- Total: 40
You can register on the course page.
- Credits: 0 ECTS
- Type of assessment: Course participation (none)
- Marking scale: No assessment
- Censorship form: No external censorship
Criteria for assessment
None
Course information
- Language: English
- Course code: CSU2017106
- Credits: 0 ECTS
- Level: Master
- Duration:
- Placement: Summer
- Schedule group: 14 - 18 August 2017
- Course capacity: 24
- Continuing and further education
- Price: EUR 2,600 (DKK 19,000) excl. Danish VAT
- Study board: Income-generating activities
Offering department
- The Niels Bohr Institute
Course responsible
- Troels Christian Petersen