0. Getting started
Installation of software.
Anaconda
It is recommended to install Python via the Anaconda Distribution. We will use the Conda Package Management System included in the Anaconda Distribution. From the documentation:
Conda is an open source package management system and environment management system that runs on Windows, macOS and Linux. Conda quickly installs, runs and updates packages and their dependencies. Conda easily creates, saves, loads and switches between environments on your local computer.
After installing Anaconda, run python --version in a terminal (if you're on Windows, use the "Anaconda Prompt"). If the output contains "Python 3.8", you're ready for the next step.
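The same check can also be done from inside Python itself, which confirms that the interpreter you actually launch is the one Anaconda installed. The following is a minimal sketch; the function name is_supported_python is made up for this illustration, and the expected version (3, 8) is taken from the text above.

```python
import sys


def is_supported_python(version_info=sys.version_info, expected=(3, 8)):
    """Return True if the major and minor version equal `expected`.

    `is_supported_python` is a hypothetical helper for this guide,
    not part of Python or Anaconda.
    """
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    # sys.version starts with the version number, e.g. "3.8.x ..."
    print("Running Python", sys.version.split()[0])
    if is_supported_python():
        print("This matches the version expected by the course.")
    else:
        print("This differs from Python 3.8; check your Anaconda installation.")
```

Save it to a file and run it with python in the same terminal (or the Anaconda Prompt) that you used for python --version.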
GitHub
The course material is hosted on the code-sharing platform GitHub (where you're currently reading this). If you're not already registered at GitHub, make a user account now: https://github.com/join. It is recommended to use the platform for your own projects during the course. As a student, you can apply for the GitHub Student Developer Pack, which includes offers and benefits from GitHub partners: https://education.github.com/students.
Kaggle
Kaggle is an online community of "data scientists", arranging data science competitions and hosting a large number of data sets. We'll make use of Kaggle in DAT158, both for course projects and as a source of data. Make an account here: https://www.kaggle.com.
Install Git
Check if Git is already installed:
git --version
If Git is not installed, you will receive an error message similar to one of the following (the first is from a Bash shell, the second from the Windows command prompt):
-bash: git: command not found
'git' is not recognized as an internal or external command, operable program or batch file.
In this case, run the following command:
conda install git
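If you prefer to script this check, the sketch below looks for Git on your PATH and prints its version. It uses only the Python standard library; the function name git_version is made up for this illustration.

```python
import shutil
import subprocess


def git_version():
    """Return the output of `git --version`, or None if Git is not on PATH.

    `git_version` is a hypothetical helper for this guide.
    """
    git_path = shutil.which("git")
    if git_path is None:
        return None
    result = subprocess.run(
        [git_path, "--version"],
        capture_output=True,
        text=True,
        check=False,  # do not raise if Git exits with an error
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = git_version()
    if version is None:
        print("Git was not found; run: conda install git")
    else:
        print(version)
```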
Test your installation
Go through the notebook notebooks/0.0-test.ipynb. To open it, start Jupyter Notebook:
jupyter notebook
You can alternatively use JupyterLab:
jupyter lab
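If either command is not found, it can help to check whether the packages are visible to your Python interpreter at all. The sketch below assumes that Jupyter Notebook and JupyterLab are importable under the module names notebook and jupyterlab; the function name missing_packages is made up for this illustration.

```python
import importlib.util


def missing_packages(names):
    """Return the module names from `names` that cannot be found.

    `missing_packages` is a hypothetical helper for this guide.
    """
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed import names for Jupyter Notebook and JupyterLab.
    missing = missing_packages(["notebook", "jupyterlab"])
    if missing:
        print("Not found in this Python installation:", ", ".join(missing))
    else:
        print("Jupyter Notebook and JupyterLab are both available.")
```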
The following video gives an introduction to Jupyter Notebook:
Troubleshooting
- If you're using GNU/Linux or macOS and the conda activate dat158 command fails, run source ~/.bash_profile and try again.
- If you're on a Mac and the conda env update command fails with a gcc error, install Xcode through the App Store and use it to install the command line tools.
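When conda activate dat158 fails, it can also be worth confirming that conda is on your PATH and that an environment named dat158 exists. The sketch below parses the output of conda env list --json, assuming it is a JSON object with an "envs" list of environment paths. The function names env_names and has_env are made up for this illustration.

```python
import json
import re
import shutil
import subprocess


def env_names(conda_json):
    """Return the last path component of each environment path.

    Assumes `conda_json` is the text printed by `conda env list --json`.
    The base environment appears under its install folder name.
    """
    paths = json.loads(conda_json).get("envs", [])
    # Split on both "/" and "\" so Windows paths are handled as well.
    return [re.split(r"[\\/]", p.rstrip("\\/"))[-1] for p in paths]


def has_env(name, conda_json):
    """Return True if an environment called `name` is listed."""
    return name in env_names(conda_json)


if __name__ == "__main__":
    conda_path = shutil.which("conda")
    if conda_path is None:
        print("conda was not found on PATH; open a new terminal and try again.")
    else:
        result = subprocess.run(
            [conda_path, "env", "list", "--json"],
            capture_output=True,
            text=True,
            check=False,
        )
        if result.returncode != 0 or not result.stdout.strip():
            print("conda did not return a list of environments.")
        else:
            print("dat158 environment found:", has_env("dat158", result.stdout))
```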
This class is supported by DataCamp. Here you will find several short courses with expert videos and hands-on-the-keyboard exercises that can be used as a supplement to DAT158. You get free access to all DataCamp content throughout the semester if you register through this link with your student email address (@stud.hvl.no).