Data Portfolio
Hi, I'm Cassia
I'm a 4th year Graphic Communication Major Concentrating in UX/UI. I recently took GrC 404 (Data Management, Estimating and Visualization in Graphic Communication) and have compiled this portfolio to showcase the work I have done throughout the quarter.


Final Data Project In Tableau Prep and Desktop
Using data found in FiveThirtyEight’s Github Repositories I cleaned and prepped data in Tableau Prep and then built out a story composed of multiple dashboards and graphs to examine racial disparities and political influences in hate crimes and police killings in 2015 with Tableau Desktop.
Parameters – filtering by top and bottom variables.
Filter by Map – used the map as a filter to showcase more raw data that was used throughout the story to be seen independently.
Groups – grouped states by who they voted for to deepen the original data surrounding the share of voters who voted for Trump.

Tableau Analysis of San Mateo Airbnb Pricing
Given the data set for San Mateo Count’s AirBnB pricing, I created a story and multiple charts that explore the dataset.
Parameters – creating custom parameters to add an interactive filter where the top amount of a dimension being shown can be controlled.
Dual-axis charts – comparing two measures under a single dimension to see their relationship to one another, synchronizing axes when they are on the same level of measurement.
Story – creating a multi slide story that layers graphs on top of images for more enriched data visualization.
Grouping – making groups to find different relationships within the same data or to turn text data to being able to be controlled numerically.

Tableau Finding Correlation
Using the collected data from a large long-term student survey, I played with finding correlations between data and drawing conclusions and presenting them.
Connecting data sources – creating a relationship between pages of a workbook to establish the foundation of the data population into Tableau.
Bar graph – displaying information with as little clutter as possible using direct labeling.

Tableau Prep Clean & Join
Using Cal Poly CIP, Admissions, and Fee data, I built out a flow a Tableau Prep flow with two outputs.
Join – used an inner join to merge columns (fees to the admission data) consolidating the data from admissions and fees into one unified dataset,
Pivot – swapped columns to rows through years.
Cleaning - Removed duplicates, fixed null values, standardized field names, and reformatted data types to get rid of extraneous information.
Union - Union to merge rows of CIP data through years.
Output – outputting two different flows to populate as independent sheets within one excel file.

Tableau Dashboard
Using built in Tableau Desktop World indicators data, I built out line, bar, and map graphs and compiled them into a dashboard.
Basics – getting a hang of basic Tableau functions such as understanding measures, dimensions, marks, filters, columns, rows, and how to navigate the “Show me” flyout menu to best visualize the data I am displaying.
Line graph – adjusting the format of axes, use of color, direct labeling, and lines to declutter and reduce unnecessary data.
Bar Graph – comparing two measures under one dimension.
Map – Color coding the map based on a calculated field and applying a filter by year.
Tiling – understanding the difference between tiled and float dashboards for cleanliness and adaptability.

Data Visualization Exercises
Completed exercises from Cole Nussbaumer’s “Storytelling with Data: Let’s Practice” workbook that were aimed at following basic design principles.
Gestalt Principles – usage of proximity, similarity, enclosure, closure, continuity, and connection to quickly interpret patterns, relationships, and trends in the information
Decluttering – removal of unnecessary elements, direct labeling, and more sparing use of color to focus on the story.
Storytelling – using the three important elements of narrative, data and visuals to explain, enlighten, and engage readers.

Google Analytics
Using the demo of the Google Merchandise store, I found information based on specific date ranges finding pageviews to monitor engagement, users to understand unique interactions, revenue to track earnings, click through rate to know how often users are clicking on an ad compared to how often it is being show, cost per click to know how much is being paid per ad, and return on ad spend to know how much is being made per dollar spent on advertising

ERDs in Caspio
Using different problem scenarios to build out Entity Relationship Diagrams.
Primary Keys - Unique identifiers for each record in an entity
Foreign Keys - Fields that create relationships between two entities by referencing the primary key in another table
Timestamps - Track when records are created or updated

Pivot Dashboards and Pivot Tables & Pivot Charts
Creating a dashboard of collected data from a large long-term student survey.
Pivot Table – calculated social media preferences based by year and alcohol spending by GPA.
Pivot Chart – created a chart from pivot tables demonstrating social media preferences with filters by university and social media platforms as well as a comparative chart between student GPA and alcohol spending.

Lookup Tables in Cost Estimating
Given a single sheet of digital prepress information containing multiple tables in different forms, I created a Digital Prepress Time sheet.
Nested Lookups – nested use of IF, AND, NOT, OR, ISBLANK, ISTEXT, ISNUMBER, VLOOKUP, and HLOOKUP with MATCH statements allowed for error handling as text feedback in calculated fields when certain information was from dependent columns and calculations if all the dependent columns were correctly filled.
Advanced Conditional Formatting – nested use of IF, AND, NOT, OR, ISBLANK, ISTEXT, and ISNUMBER functions to turn red or blue depending on the outlined conditions.

Advanced Lookups & Conditional Formatting
Given multiple worksheets with information on design and production rates and information, I built a book design project estimation sheet that compiles the total hours and total cost based on design metrics.
Advanced Lookups – Used combination of VLOOKUPs and HLOOKUPs with and without MATCH statements based on the design of the table the information was being pulled from. HLOOKUPs were used to search by row number rather than by column (VLOOKUP).
Naming – Because tables were designed each in different scenarios across multiple worksheets, we had to diversify past just using Table_Array and Menu conventions to include Lookup_Array. Lookup_Array spanned the width of a table whereas Menu would be used to correspond to specific lists, this allowed for me to do VLOOKUPs without a match statement and pull from columns not a part of a Menu.
Advanced Conditional Formatting – using AND, NOT, OR, ISBLANK, ISNUMBER, and ISTEXT functions in a formula to format cells to appear red when certain inputs have not been filled correctly as a method for providing feedback in sheet.

Vlookups
Calculating an Ink Workbook using an ink lookup table.
VLOOKUP – populating ink milage by looking up the dropdown value for paper type in the ink lookup table array and using a MATCH statement to find the ink milage number associated with the ink color as the lookup value.
Naming – creating names for the paper type and ink color with the naming convention “menu” to both create dropdowns and matching to populate values from overall table which uses naming convention “table_array”.

Contextual Menus
Given a menu of different expenses and income sources, I formatted a budget template.
Indirect Data Validation – for one dropdown to determine the dropdown options in another cell connect it via =INDIRECT(Cell).
Conditional Formatting – placing checks on the value of the dropdown to ensure “income” isn’t being used as an expense and to ensure a numerical amount is being returned, changing the color to red if these conditions are met.

Basic Excel
Given a list of paper stock types and their price basis, I completed a basic paper estimation sheet that consisted of calculated cells, input cells, and drop-down menus.
Dropdowns – combine the list items to be in a dropdown to a single name that will then be referenced when creating a dropdown with data validation, using the name as the source type when validating by list
Error handling – when filling cells with calculations, the IFERROR function was used to provide a message that would prompt the person inputting numbers to re-enter whichever number was causing the error as specified in the message
Nested IF statements – in cells where the output was dependent on an input with a dropdown, multiple IF conditions are combined within one formula to provide the output that corresponded to the dropdown item selected.

Data Cleaning
Provided a csv data file with 5000 records of generated contact information data was cleaned to break down multi-informational fields into smaller chunks. The data set provided had categories name, address, phone number, and email which we then cleaned to break into first name, last name, street address, city, state, and zip code.
Arrays – saved honorifics and credentials to reference them the category in a formula to identify and remove them from the name.
Paste – pasting special by value to split name into first and last after removing honorifics and credentials.
Trim – remove extra spaces after honorifics and credentials have been removed.
Substitute – adding extra spaces to isolate and then separate the first and last names.
Flash Fill – automatically detect patterns and separate data.
Text-to-column – split text into separate columns based on an identified delimiter.