Introduction to Data Science Programming

Date: March-June 2023


Course Description

Data Science is transforming how companies, researchers, governments and other organisations around the world address traditional problems. A Data Scientist is a highly skilled professional who is able to combine state-of-the-art computer science techniques with modern mathematical and statistical methods to extract understanding from data and create new knowledge-based services. The job market is currently short of trained professionals with this set of skills, and demand is expected to increase significantly over the next few years.

QHP4701 Introduction to Data Science Programming is designed to help students start developing the computer science skills that they will need to successfully manage complex Data Science projects. We will be using Python environments and powerful libraries to represent, process and visualise different types of data. In future modules, you will use and further develop all the skills that you acquire in QHP4701 Introduction to Data Science Programming.

Lecture Week 1 - Introduction to Data Science and Python

Overview

1.1 Introduction to Data Science and Python

This session introduces Data, Data Science and Python. We also cover the Anaconda distribution and the different interfaces it provides for Python.

PDF
1.2 Getting Started with Python: Lab Session

This worksheet covers instructions for installing Anaconda, the Python interfaces, and a few start-up activities to work through.

PDF
1.3 Collection of data

This session covers the Python data types used to represent collections of data: list, tuple, set, and dictionary.
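As a minimal sketch of the four collection types named above, the example below builds one of each and performs a basic operation on it. All variable names and values are made up for illustration and are not taken from the course materials.

```python
# Four built-in collection types, using made-up example values.
temperatures = [21.5, 19.0, 23.1]       # list: ordered, mutable
point = (3, 4)                          # tuple: ordered, immutable
tags = {"python", "data", "python"}     # set: unordered, duplicates removed
ages = {"alice": 30, "bob": 25}         # dictionary: maps keys to values

temperatures.append(20.4)               # lists can grow in place
ages["carol"] = 41                      # dictionaries accept new keys

print(len(temperatures))   # 4
print(point[0])            # 3
print(len(tags))           # 2
print(sorted(ages))        # ['alice', 'bob', 'carol']
```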

PDF
1.4 Collection of Data: Lab Session

This worksheet (Jupyter Notebook) covers collections of data, specifically list, tuple, set, and dictionary. Download this zip folder, extract it into a folder, and open the notebook in your Jupyter Notebook interface. Note that the 'images' folder must be in the same folder as the notebook so that all the figures used in the explanations are displayed.

PDF

Lecture Week 2 - Arrays in NumPy and visualisation in Matplotlib

Overview

2.1 Vectors, Matrices, and NumPy Arrays

This session starts with a brief introduction to the for-loop and its applications in linear algebra, followed by the details of NumPy arrays.
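One possible illustration of a for-loop applied to linear algebra is the dot product of two vectors, computed first with an explicit loop and then with NumPy. The vectors and the matrix below are made-up values, and the choice of the dot product as the example is an assumption rather than something stated in the session description.

```python
import numpy as np

u = [1.0, 2.0, 3.0]
v = [4.0, 5.0, 6.0]

# Dot product with an explicit for-loop over plain Python lists.
dot_loop = 0.0
for a, b in zip(u, v):
    dot_loop += a * b

# The same computation with NumPy arrays.
dot_numpy = np.dot(np.array(u), np.array(v))

# Matrix-vector product: a 2x3 matrix times a length-3 vector.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])
Av = A @ np.array(v)

print(dot_loop)    # 32.0
print(dot_numpy)   # 32.0
print(Av)          # [16. 11.]
```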

PDF
2.2 More on NumPy Arrays

This session continues with NumPy arrays.

PDF
2.3 Loop and NumPy: Lab Session

This worksheet (Jupyter Notebook) covers loops (specifically the for-loop) and NumPy. Download this zip folder, extract it into a folder, and open the notebook in your Jupyter Notebook interface. Note that the 'data' folder, which includes some of the data files you will need, must be in the same folder as the notebook.

PDF

Lecture Week 3 - Program Development and visualisation tools

Overview

3.1 Control Flow: Program Development

This session covers control flow tools such as if-else, nested loops and interruptions.
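The sketch below combines if-else, a nested loop and loop interruptions in one small search. It assumes that "interruptions" refers to the break and continue statements; the target value and the ranges are made up for illustration.

```python
# Find the first pair (i, j) in a small grid whose product equals a target.
target = 12
found = None

for i in range(1, 6):
    if i % 2 == 1:
        continue            # interruption: skip odd values of i
    for j in range(1, 6):   # nested loop
        if i * j == target:
            found = (i, j)
            break           # interruption: leave the inner loop
    if found is not None:
        break               # leave the outer loop as well

if found is None:
    print("no pair found")
else:
    print(found)            # (4, 3)
```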

PDF
3.2 Control Flow: Lab Session

This worksheet (Jupyter Notebook) covers control flow tools such as if-else, Boolean operators and for-loops and while-loops.

ZIP-Jupyter
3.3 Function: Program Development

This session covers more on control flow tools and the details of functions.
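As a small sketch of what a function definition looks like, the example below defines a function with a default argument and a docstring. The function name `mean` and its `default` parameter are hypothetical and chosen only for illustration.

```python
def mean(values, default=0.0):
    """Return the arithmetic mean of values, or default if values is empty."""
    if len(values) == 0:
        return default
    return sum(values) / len(values)


print(mean([2, 4, 6]))          # 4.0
print(mean([], default=-1.0))   # -1.0
```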

PDF
3.4 Visualisation with Matplotlib

This session covers visualisation of data using the Matplotlib library.
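A minimal line plot with Matplotlib might look like the sketch below. The data, the labels and the output file name `squares.png` are made up for illustration. The `Agg` backend is selected only so that the script runs without a display; in a Jupyter Notebook that line would normally be omitted.

```python
import matplotlib
matplotlib.use("Agg")           # non-interactive backend; no window needed
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [value ** 2 for value in x]

fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="y = x^2")   # one line with point markers
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A simple line plot")
ax.legend()
fig.savefig("squares.png")      # hypothetical output file name
```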

PDF

Lecture Week 4 - Data and File Handling

Overview

4.1 Data Handling with Pandas

This session covers data handling using the Pandas library.
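As a rough sketch of data handling with Pandas, the example below builds a small table, filters its rows and computes a grouped average. The column names and values are invented for illustration and do not come from the course data files.

```python
import pandas as pd

# A small, made-up table of measurements.
df = pd.DataFrame({
    "city": ["London", "Paris", "London", "Paris"],
    "temperature": [15.0, 18.0, 17.0, 20.0],
})

warm = df[df["temperature"] > 16.0]                       # filter rows
mean_by_city = df.groupby("city")["temperature"].mean()   # aggregate by group

print(len(warm))                 # 3
print(mean_by_city["London"])    # 16.0
print(mean_by_city["Paris"])     # 19.0
```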

PDF
4.2 Function, Visualisation, and Pandas: Lab Session

This worksheet (Jupyter Notebook) covers tasks related to functions, visualisation using Matplotlib and data handling using Pandas.

ZIP-Jupyter
4.3 Error Handling

This session covers error handling in Python.
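Error handling in Python is typically written with try and except blocks. The sketch below catches one specific exception and returns a fallback value. The function name `safe_divide` is hypothetical and used only for illustration.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None when dividing by zero."""
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        # Handle only the error we expect; other errors still propagate.
        return None
    else:
        # Runs only when the try block raised no exception.
        return result


print(safe_divide(6, 3))   # 2.0
print(safe_divide(1, 0))   # None
```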

PDF
4.4 Error Handling: Lab Session

This worksheet covers error handling and docstrings.

ZIP-Jupyter

Lecture Week 5 - File Handling and Conclusion

Overview

5.1 More on File Handling

This session covers more on file handling, specifically text files, NumPy files and pickle files. We will also have time for doubts and questions about any of the topics covered so far.
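The sketch below shows a write-then-read round trip for two of the file types mentioned, a text file and a pickle file, using only the standard library. The file names and the stored record are made up for illustration, and a temporary folder is used so that nothing is left on disk.

```python
import os
import pickle
import tempfile

record = {"name": "sample", "values": [1, 2, 3]}

with tempfile.TemporaryDirectory() as folder:
    text_path = os.path.join(folder, "notes.txt")      # hypothetical file names
    pickle_path = os.path.join(folder, "record.pkl")

    # Text file: write two lines, then read them back.
    with open(text_path, "w", encoding="utf-8") as handle:
        handle.write("first line\nsecond line\n")
    with open(text_path, "r", encoding="utf-8") as handle:
        lines = handle.read().splitlines()

    # Pickle file: store a Python object in binary form and restore it.
    with open(pickle_path, "wb") as handle:
        pickle.dump(record, handle)
    with open(pickle_path, "rb") as handle:
        restored = pickle.load(handle)

print(lines)                 # ['first line', 'second line']
print(restored == record)    # True
```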

PDF

Lab Weeks - Practical sessions

Overview

1.4 Worksheet

This session covers the worksheet on Python.

Jupyter-notebook
2.3 Worksheet

This session covers the worksheet on Python.

Jupyter-notebook