Diving Into the Deluge of Data :: Lab 5 :: Stock Viz

Lab 5: Stock Viz

This lab explores visualizing stock data by market cap and percentage change over short spans of time. The data comes from Yahoo Finance and is downloaded in CSV format. Our visualization will be a treemap, a chart type that is ubiquitous in finance.


This visualization is based on a binary tree. We will reproduce something similar, starting from scratch.

Step 0: Lab Preparation

Step 1: Source Code

Step 2: Grabbing Data

The file fetch.py contains skeleton code to download a CSV file containing stock symbols, market capitalizations, and percentage price changes relative to the 50-day moving average. Here is how it works.

Step 3: Scrubbing Data

def stock_info_from(file):
    """
    Takes a CSV file of the form

    COMPANY STOCK_SYMBOL, MARKET_CAP, PERCENT_CHANGE_50_DAYS

    where

    STOCK_SYMBOL is a string
    MARKET_CAP is a string of the form "XX.XXXB" where B = BILLION
    PERCENT_CHANGE_50_DAYS is a string of the form "[+,-]XXX.X%"

    and returns a list of Stocks

    where

    Stock.symbol is the stock symbol
    Stock.cap is an integer (the actual billion-dollar number) and
    Stock.change is a float where -20.5% is -0.205

    """

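    def to_billion(s):
        ...  # convert "XX.XXXB" to an integer dollar amount

    def to_float(s):
        ...  # convert "[+,-]XXX.X%" to a float fraction

    def row_to_stock(row):
        ...  # build a Stock from one CSV row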

   
    

Make sure to test your function out from the Python REPL. Your data should look similar to the following.

  >>> import stocks
  >>> stocks.stock_info_from("data.csv")
  [Stock(ATVI, 32650000000, 0.02205), Stock(ADBE, 54360000000, 0.053200000000000004), ..., ]
  >>>
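
The conversions described in the docstring can be sketched as follows. This is only one possible approach, not the lab's reference solution. The Stock namedtuple is a hypothetical stand-in for the lab's own Stock class, parse_rows is a helper name invented here, and the sample rows used afterwards are made up to mirror the output above.

```python
import csv
from collections import namedtuple

# Hypothetical stand-in for the lab's Stock class (assumption).
Stock = namedtuple("Stock", ["symbol", "cap", "change"])


def to_billion(s):
    """Convert a string such as "32.65B" to an integer dollar amount."""
    # round() avoids off-by-one errors from floating-point multiplication.
    return round(float(s.rstrip("B")) * 1_000_000_000)


def to_float(s):
    """Convert a string such as "-20.5%" to a fraction such as -0.205."""
    return float(s.rstrip("%")) / 100


def row_to_stock(row):
    """Build a Stock from one row of the form [symbol, cap, change]."""
    symbol, cap, change = (field.strip() for field in row)
    return Stock(symbol, to_billion(cap), to_float(change))


def parse_rows(lines):
    """Parse an iterable of CSV lines into a list of Stocks (hypothetical helper)."""
    return [row_to_stock(row) for row in csv.reader(lines) if row]
```

Calling parse_rows(["ATVI,32.65B,+2.205%"]) should give a single Stock whose cap is 32650000000 and whose change is roughly 0.02205. An open file object can be passed in place of the list, since csv.reader accepts any iterable of lines.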
  

Step 4: Making Rectangles

Recall from lecture how we could view the tree produced from build_treemap as partitioning the unit square into a series of rectangles that collectively tile the square.

Our goal in this step is to write a function generate_rects that when given a Tree tree, a Rectangle rect, and an orientation (either 'H' or 'V') returns the tiling list of rectangles.

This function will be similar to the inorder_leaves method of the Tree class. It will recursively construct a list of rectangles that correspond to the partition of the unit square induced by the tree. Each level of the recursion splits the given rectangle along the given orientation according to the weights of its left and right children. Consider the figure below, where the grey nodes correspond to leaves of the tree (i.e., stocks) with corresponding caps. You can view the rectangle generation process as starting at the top with the rectangle in all black, and splitting that rectangle into two rectangles corresponding to a 3/7 and 4/7 split along the horizontal axis. This process continues until you reach a leaf, which yields one of the rectangles.

Let's break down our code into the base case and the inductive case.

Here is the skeleton code, with the base case filled in.

      def generate_rects(tree, rect=Rectangle(0, 0, 1, 1), orient='H'):
          if tree.leaf():
              return [rect]
          else:
              ...  # inductive case: split rect by the children's weights and recurse

    

You can test your code with the following

      >>> from stocks import Stock
      >>> from viz import generate_rects
      >>> from tree import build_treemap
      >>> generate_rects(build_treemap([Stock('',5,0), Stock('',10,0), Stock('',20,0)]))
      [Rectangle(0, 0, 0.3333333333333333, 0.42857142857142855), Rectangle(0.3333333333333333, 0, 1, 0.42857142857142855), 
       Rectangle(0, 0.42857142857142855, 1, 1)]
      >>> 
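
To see where the numbers in that output come from, here is a self-contained sketch of the weighted split. It does not use the lab's classes: Node, split_rect, and sketch_generate_rects are hypothetical names, the weight, left, and right attributes are assumptions about what a tree node exposes, and the coordinate order (x0, y0, x1, y1) together with the meaning of 'H' (a cut at a fixed y) is inferred from the sample output above.

```python
from collections import namedtuple

# Assumed coordinate order, inferred from the sample output above.
Rectangle = namedtuple("Rectangle", ["x0", "y0", "x1", "y1"])


class Node:
    """Hypothetical weighted binary tree node (not the lab's Tree class)."""

    def __init__(self, weight, left=None, right=None):
        self.weight = weight
        self.left = left
        self.right = right

    def leaf(self):
        return self.left is None and self.right is None


def split_rect(rect, fraction, orient):
    """Split rect in two; the first piece gets `fraction` of the area."""
    if orient == 'H':
        # Cut at a fixed y, so the pieces are stacked along the y axis.
        y = rect.y0 + fraction * (rect.y1 - rect.y0)
        return (Rectangle(rect.x0, rect.y0, rect.x1, y),
                Rectangle(rect.x0, y, rect.x1, rect.y1))
    # orient == 'V': cut at a fixed x, so the pieces sit side by side.
    x = rect.x0 + fraction * (rect.x1 - rect.x0)
    return (Rectangle(rect.x0, rect.y0, x, rect.y1),
            Rectangle(x, rect.y0, rect.x1, rect.y1))


def sketch_generate_rects(node, rect=Rectangle(0, 0, 1, 1), orient='H'):
    """Return the leaf rectangles in left-to-right (inorder) leaf order."""
    if node.leaf():
        return [rect]
    fraction = node.left.weight / node.weight
    first, second = split_rect(rect, fraction, orient)
    flipped = 'V' if orient == 'H' else 'H'  # alternate at each level
    return (sketch_generate_rects(node.left, first, flipped)
            + sketch_generate_rects(node.right, second, flipped))
```

With the hand-built tree Node(35, Node(15, Node(5), Node(10)), Node(20)), the first cut falls at 15/35 and the second at 5/15, which are the 0.428... and 0.333... values in the expected output.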
    

Step 5: Visualizing Rectangles

The draw_rects function takes an image, a list of N rectangles that collectively tile the unit square, and a list of N colors and draws a projection of each rectangle, filled with the appropriate color, onto the image.

Some notes:

    def draw_rects(im, rects, colors):
        """
        Map and draw rectangles from the unit square onto the image

        :param im: an Image
        :param rects: a list of N rectangles where a rectangle is a pair of points
        :param colors: a list of N colors corresponding to the N rectangles"""
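
The projection step can be sketched without any imaging library. The helper below, project_rect, is a hypothetical name and not part of the lab code. It assumes a rectangle is given as (x0, y0, x1, y1) in the unit square and that the image is width by height pixels. How the resulting box is actually drawn depends on the image library in use and is left out here.

```python
def project_rect(rect, width, height):
    """Scale a unit-square rectangle to integer pixel coordinates.

    Hypothetical helper: rect is assumed to be (x0, y0, x1, y1) with all
    values between 0 and 1.
    """
    x0, y0, x1, y1 = rect
    # Rounding each edge independently means two rectangles that share an
    # edge in the unit square also share it in pixel space, so the tiling
    # has no gaps.
    return (round(x0 * width), round(y0 * height),
            round(x1 * width), round(y1 * height))
```

For a 1024 by 1024 image, the rectangle (0, 0.42857142857142855, 1, 1) would map to the pixel box (0, 439, 1024, 1024).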
    

Step 6: Putting it all Together

The function draw should perform the following:

 

To run your code from the command line use

      $ python viz.py data.csv stocks.png 1024 1024
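
One way to read those four arguments is sketched below. It assumes they are, in order, the input CSV file, the output image file, and the image width and height in pixels. The meaning of the last two numbers is a guess based on the command above, and parse_args is a hypothetical helper name.

```python
def parse_args(argv):
    """Split a command line like the one above into typed values.

    Assumed argument order: script name, CSV path, image path, width, height.
    """
    if len(argv) != 5:
        raise SystemExit("usage: python viz.py DATA_CSV OUTPUT_PNG WIDTH HEIGHT")
    _, csv_path, image_path, width, height = argv
    return csv_path, image_path, int(width), int(height)
```

In a script, this would typically be called as parse_args(sys.argv) after importing sys.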
      

Your visualization should look like this.

Step 7: Submission