NCTU Course Syllabus, Fall Semester 2020


Course Name:

Smart Data Analytics (SDA) I


Class time:

Friday, 13:20-14:10, 14:20-15:10, 15:30-16:20.


Location:

A427, NCTU


Prerequisites:

Good programming skills on your own laptop (Mac preferred) and presentation skills in Keynote or PowerPoint (PPTX). Basic knowledge of applied multivariate statistical analysis, at the level of the following reference:

Härdle WK, Simar L (2019) Applied Multivariate Statistical Analysis, 5th ed., Springer Verlag, Heidelberg.


Course content:

The evolution from analogue to digital technologies continues to dominate the attention of decision makers today. Many tools in industrial production processes have been automated or replaced by highly complex mechanisms with pre-programmed decision making. The change to digital modes of operation increasingly determines the lives of individuals and does so in increasingly unexpected ways.

The Smart Data Analytics (SDA) course presents tools and concepts for unstructured data with a strong focus on applications and implementations. It presents decision analytics in a way that is understandable for non-mathematicians and practitioners who are confronted with day-to-day, number-crunching statistical data analysis. All practical examples may be recalculated and modified; the software and Quantlets are available. The SDA course equips the practitioner with ready-to-use practical tools for smart data analytics.

Students gain insight into modern internet-based computational statistics methods. They are trained in practically relevant methods, data forms and Gestalt. The use of GitHub and network techniques is taught. Direct computer-oriented knowledge and the possibilities of empirical research are demonstrated. We present hands-on practical examples from finance, cryptocurrencies and network analysis.



Data are everywhere, and the ubiquitous availability of huge amounts of data makes it necessary to develop smart data analytics. Out of the plethora of tools available across scientific disciplines, this course offers the common data analyst easy access to all levels of analysis without requiring deep computer programming knowledge. SDA provides a wide variety of exercises. In addition, a full set of slides is provided, making it easier for participants to reanalyze the presented material. The R and Python programming languages are becoming the lingua franca of computational data analysis. They are the common smart data analysis software platforms used inside corporations and in academia. Both are OS-independent, free, open-source programs that are popularized and improved by volunteers all over the world.

The SDA I course in the fall semester covers Units 1-4. The SDA II course in the spring semester covers Units 5-8.

Unit 1


What do we see?

         Basic concepts

         Data Management

         Structuring Data elements

Unit 2


Data Analysis

         Sentiment extraction

         Stemming, lemmatizing

         Dynamic Topic Modeling (DTM)

Unit 3


Modern Data Analysis

         Cluster Analysis and Classification

         Understanding Crypto Currencies

         CRIX, a CRypto currency IndeX

Unit 4


Modern Data Analytics

         R and Python tools

         Text mining and scoring

         Applications & Empirics

Unit 5


Smart Data Analytics

         Network Centrality, Herding effects

         LSTM Neural Networks

         SVMs and Probability of Default

Unit 6


Smart Data Analytics

         Financial Risk Meter


         Hierarchical Clustering

Unit 7


Very Smart Data Analytics

         Fraud and scam detection

         Options on cryptos

         Latent Dirichlet Allocation (LDA)

Unit 8


We do Smart Data Analytics

         Machine learning in Economics

         Deep Learning of Forecasts

         Generalized Random Forests


Evaluation:
  • Homework: 70%
  • Term Project: 30%