项目
- The Cultural Borders of Songs (pudding.cool) 更新了 2021 年 2 月份的 YouTube 数据。
- DataCoaster Tycoon: Building 3D Rollercoaster Tours of Your Data in R – Rayverse Blog (tylermw.com) 但这是为什么呢?实在是太有趣了。这也是近期第二个 dive into visualization 的项目。
- Map of Reddit (anvaka.github.io) 通过获取 reddit 上数以亿计的评论({user, subreddit})将相关联的 subreddit 进行聚类。原帖在 reddit.
- Predicting the Premier League standings through text analytics — Queen’s Sports Analytics Organization (qsao-queens.com) Queen’s University 的 Queen’s Sports Analytics Organization 之前做过通过分析新闻内容预测足坛冬歇转会,这回则是预测联赛冠军的归属。
- “The Minard System” in R (minard.schochastics.net) 作者将 Sandra Rendgen (同时也是下面一集播客提到的嘉宾) 在 2018 年出版的 The Minard System 一书中,法国土木工程师 Charles-Jospeh Minard 所绘制的插图用现在的 R 语言重新制作了一遍。该项目目前还在开发当中,GitHub repo 也是开源的;能回答不少「这是怎么做到的」的疑问。
- SEARCH RECORD (search-record.net) 浏览者可以从 Google Takeout 中下载自己的数据来分析搜索记录。
观点
Over the course of our work together (supported by many people at Stamen) he (Paul Ekman) told me that he never managed to get any of his papers published in a scientific journal without having them rejected at least fifteen times. Here’s his advice:
- Leave no stone unturned
- Never take no for an answer (although, to be clear, in many cases no definitely means no; he means when you’re going after something in a professional, consensual environment)
- Always aim higher than you can see Lastly: especially in the beginning, say yes to any opportunity you have to present your work publicly. Put as much energy as you possibly can into these presentations. It’ll be terrifying at first. Do it anyway.
Since the 1970s at least, data visualization has been governed by a vague consensus—an orthodoxy—that prioritizes bare clarity over playfulness, simplicity over allegedly-gratuitous adornments, supposed objectivity over individual expression. As a consequence, generations of visualization designers grew up in an era of stern and often pious sobriety that sadly degenerated sometimes into the dismissive self-righteousness of popular slurs such as ‘chart junk’. … Nadieh Bremer and Shirley Wu are wondrous eccentrics. Their splendid book is the product of a collaborative experimental project, DataSketch.es, that might be one of the first exponents of an emerging visualization orthodoxy in which uniqueness is paramount and templates and conventions are seen with skepticism.
故事
- Following the Science (pudding.cool) 在疫情爆发之后,相关研究文献的数据变化。
工具
- Using React with D3.js (wattenberger.com)
- ggplot Wizardry: My Favorite Tricks and Secrets for Beautiful Plots in R (cedricscherer.com) Dr. Cédric Scherer 3月份的一个幻灯片讲稿,介绍了一些在使用 {ggplot2} 及其他 {gg-} 系列时的技巧。页数比较多,涉及到的图也比较多,但是讲得很细致。例如:
- seq(10, 100, by = 10) 可以用 1:10 * 10 的简写方式;
- 用 theme_set() 来控制全局主题,之后由 ggplot() 生成的图都会遵循这其中的规范;
- 用 theme_update() 来更新上述设定;
- 如果是有一系列的可视化需要处理,风格的连贯性可以增加项目的可重现性,例如 BBC Visual and Data Journalism cookbook for R graphics;
- 强烈推荐了 #tidytuesday project (github.com) 这个每周一举行一次的线上学习群体(不知道这么称呼准确不准确);
- {ggtext}: improved text rendering support
- element_markdown() markdown syntax for elements like title, caption
- geom_richtext() rotational text labels
- element_textbox(), element_textbox_simple() text boxes with word wrapping
- geom_textbox()
- custom element_textbox_highlight()
- {glue}: glue strings to data in R
- {ggforce}: provide missing functionality to ggplot2
- geom_bspline_closed() creates closed b-spline curves as shapes
- geom_mark_*() advanced labels for single or multiple points, also show groups or highlight interesting parts (fancy annotations!)
- {ggdist}: visualizations of distributions and uncertainty (this is the special missing piece I was looking for!)
- {gggibbous}: moon charts for ggplot2 (it’s like pie charts, but more elegant!)
- {ggstream}: streamgraphs
- 其他一些关于 {ggplot2} 的操作:
- theme(plot.title.position = 'plot') -> left-aligned title
- theme(plot.caption.position = 'plot' -> right-aligned caption
- theme(legend.position = 'top') -> legend on top
- guide(color = guide_colorbar(title.position = 'top', title.hjust = .5, barwidth = unit(20, 'lines'), barheight = unit(.5, 'lines'))) -> slimmer legend
- coord_cartesian(expand = FALSE) -> limit expansion (put the graphs in center)
- coord_cartesian(clip = 'off') -> turn off the coordinates clipping feature for anything out of the coordinates
- theme(plot.margin = margin(x, y, z, m)) -> customized margins (otherwise, by default would be base_size/2)
- annotation_custom(img, ymin, ymax, xmin, xmax) -> add an image, the image is created by either magick::iomage_read() or grid::rasterGrob()
- {patchwork}: combine and arrange ggplots together with simple syntax
- Interactive data tables for R • reactable (glin.github.io) 作者 Greg Lin 写的文档和例子做的尤其好。
- Say Goodbye to “Good Taste” · Data Imaginist (data-imaginist.com) {ggfx} 可以给图增加滤镜了。
播客
- 8 Stunning Data Visualization Examples that Defined 2020 with Alli Torban (leapica.com)
- 164 | Edward Tufte’s complete work with Sandra Rendgen – Data Stories 讨论了 Edward Tufte 早年的研究,出版的五本书,以及他对当下 data viz 的影响。
书籍