One of my favorite parts of data science is communicating findings in a way that is compelling and easy to understand. That is why, for my investigation at Metis, I decided to learn how to use Tableau, which provides data analysis software for business intelligence.
There are some wonderful resources available to learn Tableau, many of which I used while I was learning how to use the software. This blog post is intended to provide a gentle introduction to Tableau for beginners, demonstrating some of the functions I have found most useful in my projects.
I conducted some initial cleaning of the Airbnb listings dataset in a Jupyter Notebook using pandas, primarily to remove null values. I also removed a few Property Types that I deemed irrelevant, such as ‘Hut’ and ‘Tent.’ Following these steps, I had a dataset of 21,980 listings across New York’s five boroughs.
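For readers who want to reproduce this kind of cleaning, here is a minimal sketch of the two steps described above. The column names (‘Property Type’, ‘Price’, ‘Beds’) and the sample rows are assumptions made up for illustration, not the actual schema or contents of the dataset.

```python
import pandas as pd

# Property types treated as irrelevant in this post; extend as needed.
IRRELEVANT_TYPES = ["Hut", "Tent"]


def clean_listings(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows containing nulls, then drop irrelevant property types."""
    cleaned = df.dropna()
    # "Property Type" is a hypothetical column name.
    cleaned = cleaned[~cleaned["Property Type"].isin(IRRELEVANT_TYPES)]
    return cleaned.reset_index(drop=True)


# Tiny made-up sample standing in for the real listings file.
listings = pd.DataFrame(
    {
        "Property Type": ["Apartment", "Tent", "House", None, "Hut"],
        "Price": [150.0, 40.0, None, 90.0, 35.0],
        "Beds": [2, 1, 3, 1, 1],
    }
)

cleaned = clean_listings(listings)
print(cleaned)
```

Dropping every row that contains a null is the bluntest option; depending on the column, filling or ignoring missing values may lose less data.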
Loading the dataset
Tableau can connect to multiple data sources, such as Microsoft Excel, text files, and even databases. As the Airbnb dataset is an Excel file, we will select ‘Microsoft Excel’ to load the file.
The loaded dataset can then be viewed in the ‘Data Source’ tab:
Navigating the Tableau interface
Although learning a new tool can seem intimidating, the Tableau interface is intuitive and user-friendly. The figure below highlights a few of the most important fields that can be used to navigate the interface.
- Data and Analytics panes: The Data pane lists the data that is loaded, separated into Dimensions (categorical) and Measures (numerical). In the Analytics pane, you can add summary statistics and models to the visualization.
- Pages, Filters and Marks Shelves: The Pages shelf is used to break a view into a series of pages to analyze how a specific field affects the rest of the data in a view. The Filters shelf is used to apply filters to the data. The Marks shelf contains controls for color, size, labels, and so on.
- Columns and Rows: These shelves show the fields that have been added to the workspace.
- Show Me: This drop-down menu allows you to select the type of visualization you would like to create, and provides guidance on the inputs required for each type.
- Sheet navigation ribbon: Here you can select ‘Data Source’ to view the data, navigate through existing worksheets and dashboards or create new worksheets and dashboards. If you are familiar with Excel, you will notice that this is very similar!
Creating calculated fields
One of the Tableau features that I really enjoy using is creating calculated fields based on the available data. As the Airbnb dataset provides Price and Beds data for each listing, we can use these fields to create a new feature, Price per room. To do this, right-click on the Price column and select ‘Create Calculated Field.’ A pop-up will appear, in which we can fill in the details of the field we want to calculate, as shown in the figure below:
Notice the bottom left of the image says ‘The calculation is valid.’ In this corner of the box, Tableau provides guidance on errors in the calculation which should be corrected before it will accept the calculated field.
Tableau provides many awesome functions which you can use to create calculated fields. You can learn more about them here.
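If you would like to sanity-check a calculated field outside Tableau, the same calculation can be sketched in pandas. This assumes the field simply divides Price by Beds; the column names and sample values are made up for illustration.

```python
import pandas as pd

# Made-up sample; the third listing reports zero beds.
listings = pd.DataFrame({"Price": [200.0, 90.0, 120.0], "Beds": [2, 1, 0]})

# Replace zero-bed counts with NaN so the division yields a missing value
# instead of infinity.
beds = listings["Beds"].where(listings["Beds"] > 0)
listings["Price per room"] = listings["Price"] / beds

print(listings)
```

Guarding against zero beds is a design choice in this sketch, not something the dataset is known to require.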
Creating our first figure
Create a new worksheet by selecting the ‘New Worksheet’ icon in the bottom left of the interface.
For this figure, we will look at the average price per room by zip code. To do this, drag the zip code feature to the main window. As you can see in the figure below, Tableau will automatically recognize this as a geographic field and will populate longitude and latitude in the Columns and Rows fields.
To add the encoding for average price per room, drag the ‘Price per room’ feature we created into the ‘Marks’ shelf. Then click the icon on the left of the Price per room pill to change it to ‘Color’. Note that Tableau will sum this feature by default. To change it to an average, click the drop-down arrow on the right of the pill, scroll down to ‘Measure’ and select ‘Average’.
Next, we can add a heading that states the key takeaway, so the reader can quickly grasp the main point of the figure. I will add, “Manhattan zip codes have the highest average price per room.” To do this, double-click the ‘Heading’ field at the top of the figure and a pop-up should appear where you can edit the content. The final version of the figure can be seen in the image below.
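To see why the aggregation choice matters, here is a small pandas sketch that computes both the sum and the average of price per room for each zip code. The ‘Zipcode’ column name and the values are hypothetical.

```python
import pandas as pd

# Made-up sample: two listings in one zip code, one in another.
listings = pd.DataFrame(
    {
        "Zipcode": ["10001", "10001", "11201"],
        "Price per room": [100.0, 200.0, 80.0],
    }
)

# Compute both aggregations side by side for each zip code.
by_zip = listings.groupby("Zipcode")["Price per room"].agg(["sum", "mean"])
print(by_zip)
```

The sum grows with the number of listings in a zip code, so it mostly reflects listing density; the average is the quantity that can be compared across zip codes.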
Other tips and tricks
While this is meant to be a gentle introduction to Tableau for beginners, here are a few additional tricks I have found useful:
- Creating scatter plots: When creating a scatter plot, be sure to turn off ‘Aggregate Measures’ in the ‘Analysis’ menu. Otherwise Tableau aggregates each measure, and the individual data points collapse into aggregated marks (often a single point).
- Creating a dashboard: You can create a dashboard using the ‘New Dashboard’ tab in the bottom ribbon. Learn more about styling dashboards here.
- Saving your workbook: Your work will be saved to Tableau Public. When you do this, it is important to be aware that anyone who views the workbook can download the underlying data, unless you change the settings. If you are concerned about potential data privacy issues, you can turn off sharing by selecting ‘Edit details’ in the green ribbon at the top of Tableau Public. Then deselect ‘Allow others to download or explore and copy this workbook and its data.’
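Returning to the scatter plot tip above, the effect of aggregation can be sketched in pandas. This is only an analogy for what Tableau does with two measures and no dimensions in the view; the column names and values are made up.

```python
import pandas as pd

# Made-up sample of three listings.
listings = pd.DataFrame({"Beds": [1, 2, 3], "Price": [90.0, 200.0, 270.0]})

# Aggregated: each measure is summed, leaving a single point to plot.
aggregated = listings[["Beds", "Price"]].sum().to_frame().T

# Disaggregated: one point per listing, which is what a scatter plot needs.
disaggregated = listings[["Beds", "Price"]]

print(len(aggregated), len(disaggregated))
```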