Data Visualization

Yangyong Ye

2020-11-20

The greatest value of a picture is when it forces us to notice what we never expected to see.

--- John W. Tukey, 1977

What is data visualization for?

  • Finding regular patterns in the data (pattern recognition)

  • Discovering new hypotheses (hypothesis generation)

  • The expected versus the unexpected?

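What forms do patterns in data take?

  • Comparing magnitudes
  • Examining distributions
  • Examining composition
  • Examining correlations
  • Reading maps
  • Examining uncertainty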

How many variables could be shown on a plot?

  • x
  • y
  • color
  • fill
  • shape
  • size
  • facet (row, column)
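
As a rough illustration of the channels listed above, the following sketch maps seven variables onto a single ggplot2 figure. The `mpg` dataset ships with ggplot2; the choice of which column goes to which aesthetic is an assumption made only for demonstration. `fill` is left out because it applies to filled geoms such as bars rather than to default points.

```r
library(ggplot2)

# mpg ships with ggplot2; the column-to-aesthetic choices are illustrative only.
p <- ggplot(mpg, aes(x = displ, y = hwy,   # two numeric variables on the axes
                     color = class,        # categorical variable as point color
                     shape = drv,          # categorical variable as point shape
                     size = cty)) +        # numeric variable as point size
  geom_point(alpha = 0.6) +
  # two more variables as panel rows and columns
  facet_grid(rows = vars(year), cols = vars(cyl))

print(p)
```

A figure this dense is hard to read; the point is only that each aesthetic can carry one variable, not that all of them should be used at once.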

Data Visualization

  • Graph Type

    • Rankings: bar plot, lollipop/stem

    • Distribution: histogram, density, boxplot, violin, ridgeline

    • Correlation: scatter, correlogram

    • Composition: treemap, stacked bar, pie chart, doughnut

    • Evolution: line, area

    • Maps: background map

    • Flow: Sankey diagram

    • Other: animation and combination
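
To make the graph families concrete, here is a minimal sketch with one ggplot2 geom per family, using the `mpg` and `economics` datasets bundled with ggplot2. The variable choices are assumptions for illustration; maps, flows, and animation need extra packages and are not shown.

```r
library(ggplot2)

# Ranking: mean highway mileage per vehicle class, as ordered horizontal bars
class_means <- aggregate(hwy ~ class, data = mpg, FUN = mean)
p_rank <- ggplot(class_means, aes(x = reorder(class, hwy), y = hwy)) +
  geom_col() +
  coord_flip()

# Distribution: violin plot of highway mileage within each class
p_dist <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_violin()

# Correlation: scatter plot of engine displacement against highway mileage
p_corr <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Composition: stacked bars showing drive type within each class
p_comp <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar()

# Evolution: line chart of unemployment over time
p_evol <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()
```

Printing any of these objects, for example `print(p_rank)`, draws the corresponding figure.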

Data Visualization

  • Useful packages

    • Comprehensive package: ggplot2

    • ggplot2 extensions: ggridges, ggrepel, ggthemes

    • Color palettes: viridis, RColorBrewer, colorspace

    • Animation and layout: gganimate, patchwork
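
A short sketch of how a few of these packages fit together, assuming ggridges, ggrepel, and patchwork are installed. The datasets (`mpg` from ggplot2, `mtcars` from base R) and the plotted variables are chosen only for illustration. The viridis palette is applied through ggplot2's built-in `scale_fill_viridis_d()` rather than the viridis package itself.

```r
library(ggplot2)
library(ggridges)
library(ggrepel)
library(patchwork)

# ggridges: ridgeline densities of highway mileage, one ridge per class
p_ridge <- ggplot(mpg, aes(x = hwy, y = class, fill = class)) +
  geom_density_ridges(alpha = 0.7) +
  scale_fill_viridis_d(guide = "none")  # viridis colors, legend suppressed

# ggrepel: point labels that are pushed apart so they do not overlap
label_data <- mtcars
label_data$model <- rownames(mtcars)
p_label <- ggplot(label_data, aes(x = wt, y = mpg, label = model)) +
  geom_point() +
  geom_text_repel(size = 3)

# patchwork: `+` places the two plots side by side in one figure
combined <- p_ridge + p_label
print(combined)
```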

Code

Readings for next week

Packages that need to be installed
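
The slide does not list the packages, so the sketch below assumes it refers to the ones named under "Useful packages". It installs only those that are not already present.

```r
# Assumed package list, taken from the "Useful packages" slide
pkgs <- c("ggplot2", "ggridges", "ggrepel", "ggthemes",
          "viridis", "RColorBrewer", "colorspace",
          "gganimate", "patchwork")

# Install only the packages that are missing from the local library
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```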