Keynotes
Day 1 (08/13/2024)
Updates from Posit
Tuesday, Aug 13 9:00 AM - 10:00 AM PDT
Quarto Updates
Keynote Info
- Presenter: Hadley Wickham
- Quarto dashboards
- Cards are the fundamental unit
- Columns / Rows
format: dashboard
in YAML- Markdown headings to specify display
- Add dashboard components
- Value boxes (highlighting numbers)
- Input panels (way to group together interactive input into sidebars/toolbars)
- Add interactivity
- Native support for Observable JS
- Use frameworks like leaflet, plotly, threejs, etc. via htmlwidgets (R) and Jupyter Widgets (Python)
- Add Shiny & Shiny for Python inputs and outputs directly into code cells
- quarto.org > Guide > Dashboards
- Cards are the fundamental unit
- The future of PDF
- typst is the future of PDF
- Quarto output format: typst
format: typst
- Easy to get started, fast, easier to learn, customizable
- Quarto output format: typst
- Converts html tables to typst properties
- Example: GT table - Grant Chalmers
- Example: PBC report was made with typst
- typst is the future of PDF
- Community contributions
- Native Julia engine
- Quarto extensions
webR
Keynote Info
- Presenter: George Stagg
- webR
- R for WebAssembly; Executive R code in your web browser
- Example: https://webr.r-wasm.org/latest/
- WebAssembly
- Portable binary code format
- High-performance web applications
- Near-native execution speed
- Supported by most modern browsers
- Interactive through JavaScript
- Shinylive
- Serverless Shiny apps powered by WebAssembly Standalone or embedded in Quarto documents
- posit::conf(2023) talk: Running R-Shiny without a Server
- Examples: https://shinylive.io/r/examples/
- Quarto Live | GitHub
- Interactive documents and code exercises in Quarto powered by WebAssembly
format: live-html
; block towebr
- Also works on mobile devices/tablets
- Grading code: Custom feedback algorithms
- Dynamic Documents: WebAssembly + Observable JS
- Works in Python
Partnerships with Databricks and Snowflake
Keynote Info
- Presenter: James Blair
- Infrastructure Management
- Posit Workbench (Databricks)
- Data Access
- Posit Workbench (odbc to connect to Databricks/Snowflake)
- Application Deployment (authentication; data security)
- Posit Connect
Positron
Keynote Info
- Presenter: Hadley Wickham
- New multilingual IDE in Posit built on VS Code
Practical Tips for using Generative AI in Data Science Workflows
Tuesday, Aug 13 4:15 PM - 5:15 PM PDT
Keynote Info
- Presenter: Melissa Van Bussel
- [Slides | GitHub Repo]
- GPT 4.o has great image processing capabilities
- Math
- Example: Processing a math equation written on paper into Quarto code
- Prompt: Convert the text in the image into a Quarto document, using
format: html
and keep the same text formatting (bold, italic, underline, etc.) - Prompt: Convert the text in the image into a Quarto document, using
format: html
, and use styling to make sure that the font styles (e.g., bold, italic) and colors match the image, where possible. Align the math around the equals signs.- Also asking it to add styling to match pen colors
- Prompt: Convert the text in the image into a Quarto document, using
- Example: Processing a math equation written on paper into Quarto code
- Data entry
- Prompt: Convert the table in the image into a downloadable CSV.
- Shows you an interactive preview of the dataset; can also modify it on the fly (e.g., highlight different columns/cells, provide data cleaning/formatting instructions)
- Prompt: Simulate additional observations so that there are 1000 rows. Create two additional variables, “Years of Experience” and “Education level.” Ensure that the generated information and salaries are realistic.
- Prompt: Provide a short overview of the dataset, including the data types for each variable.
- Prompt: Provide summary statistics.
- Prompt: Analyze the dataset and extract interesting facts. Use data storytelling techniques to weave together a short (75 word) paragraph that pulls together key insights.
- Prompt: Convert the table in the image into a downloadable CSV.
- Analysis
- Prompt: Create a data visualization using the salaries dataset. Put average salary on the y-axis and job role on the x-axis. Use a different color for each job role. Create a bar chart that’s faceted by the industry variable. Add a horizontal black line showing the average salary for each industry. Format the y-axis as dollar amounts. Use a modern color palette and a minimal theme. Display a legend on the right.
- Write prompt to be consistent with the grammar of graphics
- Will show you code and graph (you can now run Python code directly within chat)
- Prompt: Create a data visualization using the salaries dataset. Put average salary on the y-axis and job role on the x-axis. Use a different color for each job role. Create a bar chart that’s faceted by the industry variable. Add a horizontal black line showing the average salary for each industry. Format the y-axis as dollar amounts. Use a modern color palette and a minimal theme. Display a legend on the right.
- Creating slides using Quarto themes
- GitHub Pages / GitLab Pages to host a Quarto project
- Creating a logo for a hex sticker
- Transparent background, scaled easily without becoming pixelated
- Text to svg generators
- Example: Recraft AI
- Text prompt and specify height, width, color palette, how simple/detailed, art styles
- Generate mock-ups
- Two tools that use Generative AI to make teaching and communicating about data easier
- Scribe: Automatically create step-by-step tutorials without copy-pasting screenshots or recording videos
- Scribe Chrome extension to create automatic screenshots/instructions (e.g., where you clicked will be highlighted)
- Can share it for free (direct link or by email)
- Can embed as an iframe, for example, on a Quarto website
- Pro-subscription for desktop version of Scribe; can export in variety of formats (including Markdown)
- descript: Edit videos and audio/podcasts as if you’re editing a Word document
- Upload video/audio file and it will be transcribed (available in over 20 different language)
- Once the transcript is ready, you can edit the transcript in order to edit the video itself
- Can remove awkward silences (shorten word gaps) and remove filler words
- AI speech generation capabilities
- Regenerate: You can use your voice clone to regenerate content
- Overdub: You can use to replace wordings
- Other AI features
- Eye contact
- Remove retakes
- YouTube descriptions: from your transcript, descript will generate a YouTube-style description and provide timestamps for sections
- Scribe: Automatically create step-by-step tutorials without copy-pasting screenshots or recording videos
- Guidelines for responsible use
Day 2 (08/14/2024)
A future of data science
Wednesday, Aug 14 9:00 AM - 10:00 AM PDT
Keynote Info
- Presenter: Allen Downey
- [Slides]
- Keynote speaker argues data science exists because statistics missed the boat on computers
- Technological trigger for data science was computation because statistics as a field missed the boat on computation
- Hype of data science
- Technology trigger
- Peak of inflated expectations (2012)
- Trough of disillusionment (2016-)
- Slope of enlightenment
- Plateau of productivity
- Slope of enlightenment
- Two things that make me optimistic
- Improving data literacy
- More and more available data
- One that makes me worry
- Excessive consumption of relentless negative media
- Two things that make me optimistic
Data Wrangling [for Python or R] Like a Boss With DuckDB
Wednesday, Aug 14 4:15 PM - 5:15 PM PDT
Keynote Info
- Presenter: Hannes Mühleisen
DuckDB
package- Advanced CSV reader
- Native support for Parquet files and Arrow structures
- An efficient parallel vectorized query processing engine
- Support for efficient atomic updates to tables
- Advantages
- Zero-dependency package available in multiple languages
- Batteries included: Fully featured data management system
- Fast
- Free!
- Simple queries
- Read dataset (
CREATE TABLE ... AS FROM Data8277.csv
) - Compute summary statistics (
SUMMARIZE FROM "Data8277.csv
)
- Read dataset (
duckplyr
R package:dplyr
+DuckDB