Visualisation - Module 3 - Big Data

Slide 1: Post-academic course Big Data. Joris Klerkx, Research Manager, PhD, joris.klerkx@cs.kuleuven.be. Visualisation, Big Data. IVPV - Instituut voor Permanente Vorming, 28-05-2015.
Slide 2: Augment group - HCI research lab, Dept. Computerwetenschappen, KU Leuven. https://augmenthuman.wordpress.com
Slide 3: Erik Duval, 11/9/1965 - 12/3/2016.
Slide 4: Our mission: to augment the human intellect (Engelbart, 1962).
Slide 5: Our mission: "By augmenting human intellect we mean increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems." Design, build and evaluate relevant tools and technologies that help users to become better in their daily life & work (Duval, 2015).
Slide 6: Our research: What are relevant user actions? How can we capture signals? How can we store them? How can we create a meaningful feedback loop? Physiological, behavioural signals; sensors, (self-)trackers; information visualization; scalable infrastructure.
Slide 7: Application domains: Technology-Enhanced Learning; Media Consumption; Science 2.0; (e)Health.
Slide 8: Slides will be posted to Slideshare & Zephyr.
Slide 10: Data.
Slide 11: Big data.
Slide 12: Big data; insights.
Slide 13: Better human understanding.
Slide 14: A mental model represents what a person thinks is true but isn't necessarily true.
Slide 15: Understanding of their mental models.
Slide 16: Wouter Walgrave -
Slide 20: "The idea that business is strictly a numbers affair has always struck me as preposterous. For one thing, I've never been particularly good at numbers, but I think I've done a reasonable job with feelings. And I'm convinced that it is feelings and feelings alone that account for the success of the Virgin brand in all of its myriad forms."
-- Richard Branson
Slide 21: Gut feeling.
Slide 22: What your gut feeling says; what the facts say.
Slide 23: What your gut feeling says; what the facts say. Confirmation bias; undervalued; overvalued; foolish.
Slide 24: Big data; insights; data-driven insights.
Slide 26: Big data; insights; data-driven insights; meaningful.
Slide 27: Defining visualization.
Slide 28: Definition.
Slide 29: Information Visualization is the use of interactive visual representations to amplify cognition [Card et al.]. Algorithm; human.
Slide 30: Definition: Information Visualisation is the use of interactive visual representations to amplify cognition [Card et al.].
Slide 32: human interaction for exploration with and understanding of big data.
Slide 33: Data visualization: scientific visualization; information visualization. Slide source: John Stasko.
Slide 34: Scientific visualisation: specifically concerned with data that has a well-defined representation in 2D or 3D space (e.g., from simulation mesh or scanner). Slide source: Robert Putman.
Slide 35: Information Visualisation: concerned with data that does not have a well-defined representation in 2D or 3D space (i.e., abstract data).
Slide 36: Dispersion (Backstrom & Kleinberg).
Slide 37: The role of visualisation.
Slide 38: Big data; insights; data-driven insights; meaningful.
Slides 39-41: Role of visualisation. Image credit: By Longlivetheux - Own work, CC BY-SA 4.0. Brehmer, M.; Munzner, T., "A Multi-Level Typology of Abstract Visualization Tasks," Visualization and Computer Graphics, IEEE Transactions on, vol. 19, no. 12, pp. 2376-2385, Dec. 2013.
Slide 42: Explore. Data insights: a visualization (Gregor Aisch).
Slide 43: Visual Analytics.
Slide 44: Big Data.
Slide 45: Diverse data: multiple data sources with varied data types ("I talk geoJSON", "I talk custom XML", "I talk Apache logs").
Slide 46: Tall data: millions of records.
Slides 47-50: Example: 51 million ratings (http://dataclysm.org).
Slide 52: Cluttered displays. Heer, J. & Kandel, S. (2012), Interactive Analysis of Big Data, XRDS, 19 (1).
Slide 53: Cluttered displays: binned density scatterplot; hexagonal instead of rectangular. Heer, J. & Kandel, S.
(2012), Interactive Analysis of Big Data, XRDS, 19 (1).
Slide 54: Wide data: multivariate data with 100s to 1000s of variables.
Slide 55: In this day of so-called Big Data, organizations are scrambling to implement new software and hardware to increase the amount of data that they collect and store. In so doing they are unwittingly making it harder to find the needles of useful information in the rapidly growing mounds of hay. If you don't know how to differentiate signals from noise, adding more noise only makes matters worse.
Slide 56: Avoid the All-You-Can-Eat buffet! (Ben Fry)
Slide 57: Visualizations might help reveal multidimensional patterns. Use the power of the machine to find a proxy in the data that predicts the selected variables. Depending on their specific questions, domain experts might select a subset of variables they are interested in.
Slide 58: Example: 4 million messages/day on OKCupid.
Slide 59: Each dot at 90% transparency.
Slide 63: Multiple views on the data allow exploration of patterns.
Slide 64: The strength of visualization.
Slide 65: Anscombe's quartet. Enables discovery of visual patterns in data sets. Graphics reveal data (Tufte, 2001).
Slide 66: Seeing is understanding. World Population Growth: A tremendous change occurred with the industrial revolution: whereas it had taken all of human history until around 1800 for world population to reach one billion, the second billion was achieved in only 130 years (1930), the third billion in less than 30 years (1959), the fourth billion in 15 years (1974), and the fifth billion in only 13 years (1987). During the 20th century alone, the population in the world has grown from 1.65 billion to 6 billion.
Slides 67-69: Facilitates understanding; human interaction for exploration and understanding; stories.
Slide 70: Nagel, M. Maitan, E. Duval, A. Vande Moere, J. Klerkx, K. Kloeckl, and C. Ratti. Touching transport - a case study on visualizing metropolitan public transit on interactive tabletops. In AVI2014: 12th ACM International Working Conference on Advanced Visual Interfaces, pages 281-288, 2014.
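The Anscombe's quartet slide claims that graphics reveal what summary statistics hide. The sketch below illustrates that point, using the four datasets as commonly published. The function names `pearson` and `summarize` are illustrative and are not taken from the slides. It only computes the summary statistics; plotting the four sets is what shows how different they are.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet as commonly published; sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary statistics rounded to two decimals."""
    return {
        "mean_x": round(mean(xs), 2),
        "mean_y": round(mean(ys), 2),
        "var_x": round(variance(xs), 2),
        "r": round(pearson(xs, ys), 2),
    }


if __name__ == "__main__":
    for name, (xs, ys) in ANSCOMBE.items():
        print(name, summarize(xs, ys))
```

All four rows print the same rounded summary, although scatterplots of the four sets look nothing alike.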
Slide 70 (continued): Human interaction for exploration and understanding.
Slide 71: Will there be enough food? insights easily.
Slide 72: Triggers impact. Patterns & triggers questions.
Slide 73: Interactivity allows comparison (http://terror.periscopic.com).
Slide 74: trends & anomalies in the data, therefore triggers questions.
Slide 75: Helps to find stories, see trends (Belgium, Brazil, USA, India).
Slide 76: Shows patterns: sentiment analysis in enterprise social network (Slack).
Slides 77-78: Client; Tracking Service; WebSockets; Database; engagement data; mouse data. 10,065 sessions were tracked. 9,674 sessions were used in the analysis. 391 sessions were removed from the analysis (noise).
Slide 79: Visualizing Reader Activity. Each square is a slide. Each row represents a navigation pattern through the slides. Column 1 shows the absolute number of readers. Column 2 shows the percentage of readers.
Slide 80: 262 readers (2.7%) go through all the slides completely, after which they quickly return to the first slide for another look. Reader time per slide: readers spend about 75 seconds (on average) on the first slide to study what information is available.
Slide 81: Shows patterns: sentiment analysis in enterprise social network (Slack). Triggers questions & creates awareness. Disclaimer: should we trust NLP algorithms?
Slide 82: Empowers users to make informed decisions: positive badges; negative badges.
Slides 83-84: Show errors in the data.
Slide 85: Creates awareness. Khaled Bachour, Frederic Kaplan, Pierre Dillenbourg, "An Interactive Table for Supporting Participation Balance in Face-to-Face Collaborative Learning," IEEE Transactions on Learning Technologies, vol. 3, no. 3, pp. 203-213, July-September, 2010.
Slides 86-88: http://visualizing.org; (big) data; Guidelines & Facts.
Slide 89: How many circles?
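The reader-activity table described in the "Visualizing Reader Activity" slides has one row per navigation pattern, with an absolute reader count and a percentage. A table like that could be derived roughly as sketched below. The session format (a list of visited slide numbers) and the name `pattern_table` are assumptions made for illustration; the slides do not describe the actual tracking pipeline. The constants at the bottom only re-check the figures quoted on the slides.

```python
from collections import Counter


def pattern_table(sessions):
    """Return rows of (pattern, reader_count, percentage), most frequent first.

    `sessions` is a hypothetical input format: one list of visited slide
    numbers per reader session, in visiting order.
    """
    counts = Counter(tuple(session) for session in sessions)
    total = sum(counts.values())
    return [
        (pattern, n, round(100 * n / total, 1))
        for pattern, n in counts.most_common()
    ]


# Toy data, invented for illustration only.
TOY_SESSIONS = [
    [1, 2, 3, 1],
    [1, 2, 3, 1],
    [1, 2],
    [1],
    [1, 2],
    [1, 2, 3, 1],
]

# Figures quoted on the slides.
TRACKED_SESSIONS = 10_065
REMOVED_AS_NOISE = 391
USED_SESSIONS = TRACKED_SESSIONS - REMOVED_AS_NOISE
FULL_READ_SHARE = round(100 * 262 / USED_SESSIONS, 1)

if __name__ == "__main__":
    for row in pattern_table(TOY_SESSIONS):
        print(row)
    print(USED_SESSIONS, FULL_READ_SHARE)
```

The quoted figures are consistent with each other: tracked minus removed sessions gives the number of sessions used, and 262 readers is about 2.7% of that number.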
Slide 90: Humans have advanced perceptual abilities. Our brains make us extremely good at recognizing visual patterns.
Slide 92: Humans have little short-term memory. Our brain remembers relatively little of what we perceive. Most of us can only hold three to seven chunks of data at the same time.
Slide 93: Recognition: identify previously learned information.
Slide 94: Humans have advanced perceptual abilities: our brains make us extremely good at recognizing visual patterns. Humans have little short-term memory: our brains remember relatively little of what we perceive. Externalize data by using interactive, visual encodings. Promote recognition rather than recall.
Slide 95: (9:51 - 11:22)
Slides 96-97: The centrality of human activity in the process is key.
Slide 98: Explore. Data insights: a visualization (Gregor Aisch).
Slide 99: "It's not a magical algorithm that finds the insight for you. You have to look at the overview, you have to decide what you zoom in to, what you filter out. And then you click to get the details." Ben Shneiderman, 2011.
Slides 100-102: Overview first, zoom & filter, details-on-demand.
Slide 103: Information Seeking Mantra.
Slide 104: Real data is ugly and needs to be cleaned. your data.
Slide 105: check & pre-process your data.
Slide 106: Elections 14/10/12. Forget about 3D graphs (on a 2D screen): occlusion; complex to interact with; doesn't add anything to the data.
Slide 107: What if we need to add a 3rd variable? Source: Stephen Few.
Slide 108: Use small coordinated graphs to add variables.
Slide 109: Forget about 3D graphs. Which student has more blogposts? Size & angle are difficult to compare. Without labels & legends, impossible to show exact quantitative differences. Limited short-term (visual) memory. Source: Stephen Few.
Slides 110-111: Source: Stephen Few. Save the pies for dessert (S.
Few). Try using either of the pies to put the slices in order by. demorgen.be, vtm.be: Elections 14/10/12.
Slide 112: Obviously there are exceptions to the rule.
Slide 113: Use Common Sense. [Figure: bar charts for Student 1 showing blogposts, tweets, comments on blogs and reports submitted; axis from 0 to 30.]
Slide 114: Use Common Sense. What are you comparing? What story do you get from it? [Figure: charts for Students 1 to 4 showing blogposts, tweets, comments on blogs and reports submitted; one axis from 0 to 60, one from 0% to 100%.]
Slide 115: Which graph makes it easier to focus on the pattern of change through time, instead of the individual values? Choose the graph that answers your questions about your data. Source: Stephen Few.
Slide 116: Communicate the correct story. deredactie.be, nieuwsblad.be: Elections 14/10/12.
Slides 117-118: Don't use visualisations to mislead.
Slides 119-120: Source: Stephen Few.
Slide 124: How much better are the drinking water conditions in Willowtown as compared to Silvatown?
Slide 125: with visualisation.
Slide 126: Visualization tasks. Brehmer, M.; Munzner, T., "A Multi-Level Typology of Abstract Visualization Tasks," Visualization and Computer Graphics, IEEE Transactions on, vol. 19, no. 12, pp. 2376-2385, Dec.
2013.
Slides 127-128: Perception.
Slide 129: Our brains make us extremely good at recognizing visual patterns. Source: Katrien Verbert.
Slide 130: Source: Katrien Verbert.
Slide 131: Pre-attentive characteristics: a limited set of visual properties that are detected by the low-level visual system very rapidly (< 200 to 250 ms), accurately, with little effort, and before attention is focused on them. Note that eye movements take at least 200 ms to initiate. Healey, C., & Enns, J. (2012). Attention and Visual Memory in Visualization and Computer Graphics. IEEE Transactions on Visualization and Computer Graphics, 18(7), 1170-1188.
Slide 132: Pre-attentive characteristics. Find the red dot (hue). Find the dot (shape). Find the red dot (conjunction: not pre-attentive). to spot differences in multi-element display.
Slide 133: Pre-attentive characteristics: line orientation; length, width; closure; size; curvature; density, contrast; intersection; 3D depth. Not all of them allow showing exact quantitative differences. Helps to spot differences in multi-element display.
Slides 134-135: Gestalt Laws (pattern laws): basic rules or design principles that describe perceptual phenomena. They explain the way users or humans see patterns in visualisations. Figure & Ground.
Slide 137: Closure; Smallness.
Slide 138: Common Fate: objects with a common movement, that move in the same direction, at the same pace, at the same time are organised as a group (Ehrenstein, 2004). Source: Katrien Verbert.
Slide 139: Law of Isomorphism: similarity that can be behavioural or perceptual, and can be a response based on the viewer's previous experiences (Luchins & Luchins, 1999; Chang, 2002). This law is the basis for symbolism (Schamber, 1986).
Slide 140: London Tube Map: which Gestalt laws do you see?
Slide 141: Visualization design process.
Slide 142: B. McDonnel and N. Elmqvist. Towards utilizing GPUs in information visualization: A model and implementation of image-space operations. Visualization and Computer Graphics, IEEE Transactions on, 15(6):1105-1112, 2009.
Slides 143-144: structure: time, hierarchy, network, 1D, 2D, nD, ...; questions: where, when, how often, ...; audience: domain & visualisation expertise, ...
Slide 145: S. Stevens.
On the theory of scales of measurement. Science, 103(2684), 1946. Structure: Time? Hierarchical? 1D? 2D? nD? Network?
Slide 146: Questions (to get things going): What? When? How much? How often? What is the average number of students that bought the course book? When did students start looking at the course material? How many hours did Peter work on this assignment? (Why did Peter have to redo his assignment?) How often did Peter retake the course before he passed? (Why?)
Slides 148-149: Visual mapping: encode data characteristics into visual form. Each mark (point, line, area, ...) represents a data element. Think about relationships between elements (position). "Simplicity is the ultimate sophistication." Leonardo da Vinci. Size. Length: how much bigger is the lower bar? Slide adapted from Michael Porath & Katrien Verbert.
Slide 150: Length: x5. Area: how much bigger is the right circle? Slide adapted from Michael Porath & Katrien Verbert.
Slide 151: Area: x9. How much bigger is the right circle?
Slide 152: Apparent magnitude curves.
Slide 153: Which one looks more accurate? Slide adapted from Michael Porath.
Slides 154-155: Compensating magnitude to match perception. Color. Color principles - hue, saturation, and value: a maximum of about 5 colors (for categories, ...) because of short-term memory; hue: categorical; saturation: ordinal and quantitative; luminance/brightness: ordinal and quantitative. How to choose colors. Source: Katrien Verbert.
Slide 156: http://colorbrewer2.org
Slides 158-159: ..., sweetness, aroma, bitterness, and quality.
Slide 160: How to choose colors. Position.
Slides 161-162: Position & color. J. Mackinlay. Automating the design of graphical presentations of relational information. ACM Transactions On Graphics, 5(2):110-141, 1986.
Slide 164: J. Mackinlay. Automating the design of graphical presentations of relational information. ACM Transactions On Graphics, 5(2):110-141, 1986.
Slide 165: Example: Facebook privacy statement. "Offer precise controls for sharing on the Internet..." Users should navigate through 50 settings with more than 170 options. Questions? How did its complexity change over time?
How does its length compare to privacy statements of other tools?
Slides 166-168: How did its complexity change over time? How does its length compare to privacy statements of other tools? Encoding: weather forecast on a smartphone.
Slide 169: ? Joris Klerkx, Research Manager, https://augmenthuman.wordpress.com
Always on the lookout for new opportunities.
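The "Length", "Area" and "Compensating magnitude to match perception" slides touch on a practical encoding question: how large to draw a circle for a given value. The sketch below is one hedged reading of those slides. It sizes circles by area rather than radius, and it models perceived magnitude with a power law. The exponent 0.7 for area is an assumed, commonly quoted approximation and does not come from the slides. All function names are illustrative.

```python
import math

# ASSUMPTION: approximate power-law exponent for perceived area.
# This value is not taken from the slides.
AREA_EXPONENT = 0.7


def radius_for_value(value, unit_radius=1.0):
    """Radius of a circle whose AREA (not radius) is proportional to value."""
    return unit_radius * math.sqrt(value)


def perceived_ratio(physical_ratio, exponent=AREA_EXPONENT):
    """Perceived size ratio under a power-law model of perception."""
    return physical_ratio ** exponent


def compensated_ratio(target_ratio, exponent=AREA_EXPONENT):
    """Physical ratio needed for the perceived ratio to equal target_ratio."""
    return target_ratio ** (1.0 / exponent)


if __name__ == "__main__":
    # A value 9 times larger needs a radius only 3 times larger.
    print(radius_for_value(9))
    # Under the assumed exponent, a 9x area is perceived as less than 9x.
    print(perceived_ratio(9))
    # Physical area ratio that would be perceived as 9x under this model.
    print(compensated_ratio(9))
```

Whether to apply such compensation at all is a design choice. Labelling exact values or using length instead of area avoids relying on an assumed exponent.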

