Lynn Cherny

Lynn Cherny is a data analysis and visualization consultant and currently a visiting Knight Fellow at the University of Miami. She has a Ph.D. in Linguistics from Stanford and an M.Phil. in Computer Speech and Language Processing from Cambridge University. Her career began in research in an HCI group at Bell Labs (later AT&T Labs), but she left research to work in industry as a UI designer. She spent 18 years in various UX, UI, and usability roles in Silicon Valley, Paris, Seattle, and Boston, at companies including Excite, TiVo, Adobe, Autodesk, The MathWorks, and SolidWorks.

As a consultant for the past 6 years, Lynn has worked on data visualization and analysis problems including creating customer personas from survey data and interviews, visualizing tax brackets, analyzing bug reports and stack traces, clustering pharmaceutical drug reports, network analysis for topic modeling, entity recognition of company names in news articles, and dashboard design. For the past year, she has taught D3.js for data visualization to MFA students at the University of Miami. Her passion is text visualization, and with collaborators at Bocoup.com she is developing an educational toolset to make it easier for non-specialists.

Lynn also moderates the data-vis-jobs list on googlegroups.com and is the co-chair of OpenVis Conference, an open source web visualization conference held annually in Boston in April.


Workshop: Visualizing Text

WORKSHOP FEE: $300
MATERIALS FEE: None.

Text documents are data too, if you know how to look at them right. Handling text data -- and visualizing texts as data -- requires a few new analytic skills and some creative thinking. In this workshop, we'll take documents apart and redraw them in visual form!

We'll use Python, shell scripting, and simple JavaScript to analyze your text documents and produce interactive visualizations. I'll provide lots of code samples and tools to get you going on your later projects with text. You'll learn the basics of Natural Language Processing (NLP) and why those basics are so important in your analysis. You can get creative after you crunch the bits.

SKILL LEVEL: Intro / Intermediate
It will be helpful, but not required, to have a basic knowledge of the command line and some knowledge of Python and JavaScript.

OVERVIEW:
• Text genres and documents: the "Big Picture" visual of form
• Tokenization: Breaking up a document into words and counting them. Word clouds and word shapes.
• Lemmas and parts-of-speech and what they are good for
• Comparing documents: TF-IDF, clustering, tree diagrams
• Word2vec: what's all the hype, why is it so cool, and let's do it on your documents!
• (Time allowing): Intro to topic modeling
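
To give a flavor of the tokenization and TF-IDF items above, here is a minimal sketch, not the workshop's actual code. The function names `tokenize` and `tf_idf`, the simple regular-expression tokenizer, and the unsmoothed TF-IDF formula are all assumptions chosen for illustration. It uses only the Python standard library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes.

    Hypothetical helper: a deliberately naive tokenizer for illustration.
    """
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(docs):
    """Return one {word: score} dict per document (hypothetical helper).

    Assumed formula (one common unsmoothed variant):
      tf  = count of the word in the document / tokens in the document
      idf = log(number of documents / documents containing the word)
    """
    token_lists = [tokenize(doc) for doc in docs]

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))

    n_docs = len(docs)
    scores = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = float(len(tokens)) or 1.0  # avoid dividing by zero on empty docs
        scores.append({
            word: (count / total) * math.log(float(n_docs) / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


if __name__ == "__main__":
    docs = ["The cat sat on the mat.", "The dog chased the cat."]
    print(Counter(tokenize(docs[0])).most_common(3))  # raw word counts
    for doc_scores in tf_idf(docs):
        # Words unique to one document score highest; shared words score 0.
        print(sorted(doc_scores.items(), key=lambda kv: -kv[1])[:3])
```

Under this formula, a word that appears in every document gets a score of zero, which is why TF-IDF is handy for spotting what makes one document different from the others.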

WHAT TO BRING:
• Laptop that can run Python
• Chrome (ideally)
• A web server
• A text code editor
• Python 2.7
• A terminal program/shell capable of running Python and Unix shell commands
• Document(s) to analyze, ideally more than one!

Session: Things I Think Are Awesome

I live and love at the intersection of machine learning, story generation, and data visualization. If my personal projects don't make me smile, I feel I failed. We'll tour through some tiny clever work that inspires me from art, AI, games, and sometimes even real life. And I'll show you some of my own digital toys that live between text and visual, inspired by other works that made me laugh.