Introduction to Data Science Technologies
Data Science Tools and Technologies
Now that you know something about what data science is and the iterative process that data scientists follow, it’s time to consider the technologies that you can use to explore data.
There’s a huge range of tools and technologies that you can use as a data scientist, ranging from spreadsheet tools like Microsoft Excel through to large-scale data processing platforms like Apache Spark. This course focuses on a core set of tools that you can use to perform most data exploration, cleaning, and modeling tasks, and which you will use in the course labs. These tools are provided in the Microsoft Azure Machine Learning service, which includes:
- A graphical environment called Azure Machine Learning Studio, in which you can create data science experiments and publish machine learning models as web services.
- Jupyter Notebooks – an online tool for running custom R or Python code to explore and visualize data.
In this lesson, Steve will introduce the tools and walk through an example of their use. Don’t worry too much about the details of the example at this stage – we’ll cover data cleansing techniques and machine learning models in more detail later. For now, just focus on understanding how the tools work and how they can be used together.
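To give a rough feel for the kind of custom Python code you might run in a Jupyter notebook, here is a minimal sketch of a load, clean, and summarize step. It is not taken from the course labs: the sample data, the column names (`city`, `temperature_c`), and the helper functions are all hypothetical and invented for illustration, and it uses only the Python standard library so it runs anywhere.

```python
import csv
import io
import statistics

# Hypothetical sample data, invented purely for illustration.
# One row has a missing temperature to show a simple cleaning step.
RAW_CSV = """city,temperature_c
Seattle,12.5
Seattle,14.0
Redmond,11.0
Redmond,
Redmond,13.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Drop rows with a missing temperature and convert the rest to float."""
    cleaned = []
    for row in rows:
        value = row["temperature_c"].strip()
        if value:
            cleaned.append({"city": row["city"], "temperature_c": float(value)})
    return cleaned


def mean_by_city(rows):
    """Group temperatures by city and return the mean for each city."""
    groups = {}
    for row in rows:
        groups.setdefault(row["city"], []).append(row["temperature_c"])
    return {city: statistics.mean(values) for city, values in groups.items()}


rows = load_rows(RAW_CSV)
cleaned = clean_rows(rows)
summary = mean_by_city(cleaned)

print(len(rows), "rows loaded,", len(cleaned), "kept after cleaning")
print(summary)
```

In a notebook you would typically run each of these steps in its own cell and inspect the intermediate results before moving on, which suits the iterative way data scientists work.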