tidyverse packages
Source: R for Data Science
Data frames are created with the data.frame() function, with named vectors as input
Extra: Recall how pandas data frames in Python are dictionaries of equal-length lists/arrays
df <- data.frame(
x = 1:4,
y = c("a", "b", "c", "d"),
z = c(TRUE, FALSE, FALSE, TRUE)
)
df
  x y     z
1 1 a  TRUE
2 2 b FALSE
3 3 c FALSE
4 4 d  TRUE
# str() function applied to data frame is useful in determining variable types
str(df)
'data.frame': 4 obs. of 3 variables:
 $ x: int 1 2 3 4
 $ y: chr "a" "b" "c" "d"
 $ z: logi TRUE FALSE FALSE TRUE
# dim() behaves as it does for a matrix, returning the number of rows and the number of columns, respectively
dim(df)
[1] 4 3
# In contrast to a matrix, length() of a data frame returns the length of the underlying list (i.e. the number of columns)
length(df)
[1] 3
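The point that a data frame is a list of equal-length columns can be checked directly. The sketch below uses a hypothetical `df_demo` object (not part of the original notebook) so that it does not interfere with `df`.

```r
# Hypothetical example object with the same shape as df above
df_demo <- data.frame(
  x = 1:4,
  y = c("a", "b", "c", "d"),
  z = c(TRUE, FALSE, FALSE, TRUE)
)

# A data frame is stored as a list
is.list(df_demo)
typeof(df_demo)

# Each list element is one column, and all columns have the same length
col_lengths <- sapply(df_demo, length)
col_lengths

# length() counts list elements, so it agrees with ncol()
length(df_demo) == ncol(df_demo)
```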
l <- list(x = 1:5, y = letters[1:5], z = rep(c(TRUE, FALSE), length.out = 5))
l
$x
[1] 1 2 3 4 5

$y
[1] "a" "b" "c" "d" "e"

$z
[1]  TRUE FALSE  TRUE FALSE  TRUE
df <- data.frame(l)
df
  x y     z
1 1 a  TRUE
2 2 b FALSE
3 3 c  TRUE
4 4 d FALSE
5 5 e  TRUE
str(df)
'data.frame': 5 obs. of 3 variables:
 $ x: int 1 2 3 4 5
 $ y: chr "a" "b" "c" "d" ...
 $ z: logi TRUE FALSE TRUE FALSE TRUE
data_frame[row_indices, column_indices]
data_frame[row_indices, column_name(s)]
data_frame[column_indices]
data_frame[column_name(s)]
data_frame$column_name
# Like a list
df[c("x", "z")]
  x     z
1 1  TRUE
2 2 FALSE
3 3  TRUE
4 4 FALSE
5 5  TRUE
# Like a matrix
df[, c("x", "z")]
  x     z
1 1  TRUE
2 2 FALSE
3 3  TRUE
4 4 FALSE
5 5  TRUE
df[df$y == "b", ]
  x y     z
2 2 b FALSE
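The subsetting forms listed above differ in what they return: list-style `[` keeps a data frame, while `[[`, `$` and matrix-style selection of a single column give back the column vector. A minimal sketch, using a hypothetical `df_demo` object:

```r
# Hypothetical example object, not from the original notebook
df_demo <- data.frame(x = 1:5, y = letters[1:5])

# List-style single brackets keep the data frame structure
sub_df <- df_demo["x"]
class(sub_df)

# Double brackets and $ extract the column itself as a vector
col_vec <- df_demo[["x"]]
class(col_vec)
identical(col_vec, df_demo$x)

# Matrix-style selection of one column drops to a vector by default;
# drop = FALSE keeps a one-column data frame
class(df_demo[, "x"])
class(df_demo[, "x", drop = FALSE])
```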
rbind() (row bind) - appends a row to a data frame
cbind() (column bind) - appends a column to a data frame
rand <- rnorm(5)
rand
[1] 0.51628820 -0.15978300 -0.07196149 2.58399787 0.30255112
df <- cbind(df, rand)
df
  x y     z        rand
1 1 a  TRUE  0.51628820
2 2 b FALSE -0.15978300
3 3 c  TRUE -0.07196149
4 4 d FALSE  2.58399787
5 5 e  TRUE  0.30255112
# Note that a row has to be a list as it contains different data types
r <- list(6, letters[6], FALSE, rnorm(1))
r
[[1]]
[1] 6

[[2]]
[1] "f"

[[3]]
[1] FALSE

[[4]]
[1] 0.1202676
df <- rbind(df, r)
df
  x y     z        rand
1 1 a  TRUE  0.51628820
2 2 b FALSE -0.15978300
3 3 c  TRUE -0.07196149
4 4 d FALSE  2.58399787
5 5 e  TRUE  0.30255112
6 6 f FALSE  0.12026760
# New columns can also be created/modified by assignment
# (if the right-hand side object has correct length)
df["r"] <- rnorm(6)
df
  x y     z        rand          r
1 1 a  TRUE  0.51628820 -1.4450354
2 2 b FALSE -0.15978300 -1.1488777
3 3 c  TRUE -0.07196149 -1.6461525
4 4 d FALSE  2.58399787  1.5966023
5 5 e  TRUE  0.30255112  0.7387282
6 6 f FALSE  0.12026760  1.1970729
# Individual columns can also be selected with $ operator
df$r <- df$r + 5
df
  x y     z        rand        r
1 1 a  TRUE  0.51628820 3.554965
2 2 b FALSE -0.15978300 3.851122
3 3 c  TRUE -0.07196149 3.353848
4 4 d FALSE  2.58399787 6.596602
5 5 e  TRUE  0.30255112 5.738728
6 6 f FALSE  0.12026760 6.197073
# colnames() or names() return the column names of a data frame
colnames(df)
[1] "x" "y" "z" "rand" "r"
colnames(df)[4] <- "rand"
df
  x y     z        rand        r
1 1 a  TRUE  0.51628820 3.554965
2 2 b FALSE -0.15978300 3.851122
3 3 c  TRUE -0.07196149 3.353848
4 4 d FALSE  2.58399787 6.596602
5 5 e  TRUE  0.30255112 5.738728
6 6 f FALSE  0.12026760 6.197073
tidyverse packages
tidyverse package ecosystem - rich collection of data science packages
readr - data input/output (also readxl for spreadsheets, haven for SPSS/Stata)
dplyr - data manipulation (also tidyr for pivoting)
ggplot2 - data visualisation
lubridate - working with dates and time
tibble - enhanced data frame
install.packages("tidyverse")
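tibble::tibble() function - creates a tibble from named vectors
tibble::as_tibble() function - converts an existing object (e.g. a data frame) to a tibble
tb <- tibble::tibble(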
x = 1:4,
y = c("a", "b", "c", "d"),
z = c(TRUE, FALSE, FALSE, TRUE)
)
tb
  x y     z
1 1 a  TRUE
2 2 b FALSE
3 3 c FALSE
4 4 d  TRUE
str(tb)
tibble [4 × 3] (S3: tbl_df/tbl/data.frame)
 $ x: int [1:4] 1 2 3 4
 $ y: chr [1:4] "a" "b" "c" "d"
 $ z: logi [1:4] TRUE FALSE FALSE TRUE
dim(tb)
[1] 4 3
tb[c("x", "z")]
  x     z
1 1  TRUE
2 2 FALSE
3 3 FALSE
4 4  TRUE
tb[tb$y == "b", ]
  x y     z
1 2 b FALSE
# New columns can also be created/modified by assignment (if the RHS object has correct length)
tb["r"] <- rnorm(4)
tb
  x y     z          r
1 1 a  TRUE -1.5858383
2 2 b FALSE -1.7991281
3 3 c FALSE -1.0582633
4 4 d  TRUE -0.7937325
# Individual columns can also be selected with $ operator
tb$r <- tb$r + 5
tb
  x y     z        r
1 1 a  TRUE 3.414162
2 2 b FALSE 3.200872
3 3 c FALSE 3.941737
4 4 d  TRUE 4.206267
# names() attribute for data frames/tibbles contains column names
names(tb)
[1] "x" "y" "z" "r"
names(tb)[4] <- "rand"
tb
  x y     z     rand
1 1 a  TRUE 3.414162
2 2 b FALSE 3.200872
3 3 c FALSE 3.941737
4 4 d  TRUE 4.206267
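One practical difference between the two structures is what matrix-style selection of a single column returns. The sketch below assumes the tibble package is installed and uses hypothetical `df_demo` and `tb_demo` objects.

```r
# Hypothetical example objects, not from the original notebook
df_demo <- data.frame(x = 1:3, y = c("a", "b", "c"))
tb_demo <- tibble::as_tibble(df_demo)

# Selecting one column matrix-style: the data frame drops to a vector,
# the tibble stays a tibble
is.data.frame(df_demo[, "x"])
is.data.frame(tb_demo[, "x"])

# A tibble is still a data frame, so base R functions keep working
inherits(tb_demo, "data.frame")
dim(tb_demo)
```

Because the result type of `[` on a tibble does not depend on how many columns are selected, code that subsets tibbles tends to be more predictable inside functions.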
dplyr
dplyr - is one of the core packages for data manipulation in tidyverse
Its principal functions are:
filter() - subset rows from data
mutate() - add new/modify existing variables
rename() - rename existing variable
select() - subset columns from data
arrange() - order data by some variable
For data summary:
group_by() - group data by some variable
summarise() - create a summary of grouped variables
library("dplyr")
Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union
dplyr examples
dplyr::filter(tb, y == 'b', z == FALSE)
  x y     z     rand
1 2 b FALSE 3.200872
# Note that dplyr functions do not require quoted variable names
dplyr::select(tb, x, z)
  x     z
1 1  TRUE
2 2 FALSE
3 3 FALSE
4 4  TRUE
# We can also use helpful tidyselect functions for more complex rules
dplyr::select(tb, tidyselect::starts_with('r'))
      rand
1 3.414162
2 3.200872
3 3.941737
4 4.206267
# Data is not modified in-place, you need to re-assign the results
tb <- dplyr::rename(tb, random = rand)
dplyr::mutate(tb, random_8plus = ifelse(random >= 8, TRUE, FALSE))
  x y     z   random random_8plus
1 1 a  TRUE 3.414162        FALSE
2 2 b FALSE 3.200872        FALSE
3 3 c FALSE 3.941737        FALSE
4 4 d  TRUE 4.206267        FALSE
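The summary functions group_by() and summarise() listed earlier are not shown in the examples above. A minimal sketch of how they combine, assuming dplyr and tibble are installed; the `scores` data is hypothetical and not part of the original notebook.

```r
# Hypothetical toy data
scores <- tibble::tibble(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# group_by() only marks the grouping; the data itself is unchanged
grouped <- dplyr::group_by(scores, group)

# summarise() then collapses each group into a single row
summary_tb <- dplyr::summarise(
  grouped,
  n = dplyr::n(),            # number of rows per group
  mean_value = mean(value)   # group mean
)
summary_tb

# arrange() orders the summary, here by descending mean
dplyr::arrange(summary_tb, dplyr::desc(mean_value))
```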
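%>% operator
Users of tidyverse packages are encouraged to use the pipe operator %>%
Base R now also has its own pipe operator |> (since R 4.1.0), but it is still relatively uncommon
The left-hand side is passed as the first argument of the function, so the two forms below are equivalent:
<result> <- <input> %>%
  <function_name>(., arg_1, arg_2, ..., arg_n)
<result> <- <input> %>%
  <function_name>(arg_1, arg_2, ..., arg_n)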
%>% operator examples
tb
  x y     z   random
1 1 a  TRUE 3.414162
2 2 b FALSE 3.200872
3 3 c FALSE 3.941737
4 4 d  TRUE 4.206267
tb <- tb %>%
dplyr::mutate(random_2 = rnorm(4)) %>%
dplyr::filter(z == FALSE)
tb
  x y     z   random  random_2
1 2 b FALSE 3.200872 0.6260713
2 3 c FALSE 3.941737 0.6916460
# Pipe %>% can also be used with non-dplyr functions
tb$x %>% .[2]
[1] 3
# Base R pipe operator |> is more restrictive (e.g. tb$x |> `[`(2) doesn't work)
tb |> nrow()
[1] 2
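To make the pipe semantics concrete, the sketch below compares a nested call with the equivalent native-pipe chain, and shows one possible workaround for the `[` restriction mentioned above: wrapping the indexing in an anonymous function. It needs R 4.1.0 or later; the vector `x` is a hypothetical example.

```r
# Hypothetical example vector
x <- c(1, 4, 9, 16)

# Nested call: has to be read inside-out
nested <- round(mean(sqrt(x)), 1)

# Same computation with the native pipe: reads left to right
piped <- x |> sqrt() |> mean() |> round(1)

identical(nested, piped)

# `[` cannot be called directly on the right-hand side of |>,
# but an anonymous function wrapped in parentheses can
second <- x |> (\(v) v[2])()
second
```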
tidyr::pivot_wider() - pivot data from long to wide format
tidyr::pivot_longer() - pivot data from wide to long format
[Figure: pivot_wider() and pivot_longer() illustrations. Source: R for Data Science]
tb2 <- tibble::tibble(
country = c("Afghanistan", "Brazil"),
`1999` = c(745, 2666),
`2000` = c(37737, 80488)
)
tb2
  country     1999  2000
1 Afghanistan  745 37737
2 Brazil      2666 80488
tb2 <- tb2 %>%
  # Note that pivoting functions come from the 'tidyr' package
tidyr::pivot_longer(cols = c("1999", "2000"), names_to = "year", values_to = "cases")
tb2
  country     year cases
1 Afghanistan 1999   745
2 Afghanistan 2000 37737
3 Brazil      1999  2666
4 Brazil      2000 80488
tb2 <- tb2 %>%
tidyr::pivot_wider(names_from = "year", values_from = "cases")
tb2
  country     1999  2000
1 Afghanistan  745 37737
2 Brazil      2666 80488
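After pivot_longer() the former column names end up in a character column, so `year` above holds strings rather than numbers. One way to convert them during pivoting is the `names_transform` argument (available in recent tidyr versions). The sketch below assumes tidyr and tibble are installed and uses hypothetical `wide` and `long` objects so that `tb2` is left untouched.

```r
# Hypothetical copy of the wide table from above
wide <- tibble::tibble(
  country = c("Afghanistan", "Brazil"),
  `1999` = c(745, 2666),
  `2000` = c(37737, 80488)
)

long <- tidyr::pivot_longer(
  wide,
  cols = -country,                           # every column except country
  names_to = "year",
  values_to = "cases",
  names_transform = list(year = as.integer)  # "1999" becomes 1999L
)
str(long)
```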
.csv (comma-separated values) - files for storing tabular data
.rds (R data serialization) - files that store a single R object (similar to pickle in Python)
.rda (R data) - files for saving and loading multiple R objects
.feather/.parquet - big data formats associated with the Apache Arrow/Hadoop ecosystems
.csv (comma-separated values)
read.csv()/write.csv() - base R functions
readr::read_csv()/readr::write_csv() - functions from readr package in tidyverse
.rds (R data serialization)
readRDS()/saveRDS() - base R functions
readr::read_rds()/readr::write_rds() - functions from readr (no default compression)
.rda (R data)
save()/load() - base R functions
.feather/.parquet
arrow::read_feather()/arrow::write_feather() and arrow::read_parquet()/arrow::write_parquet() - functions from arrow package in Apache Arrow
# We are skipping the first row as this dataset has a composite header of 2 rows (variable name, question)
kaggle2020 <- readr::read_csv('../data/kaggle_survey_2020_responses.csv', skip = 1)
Warning message:
“One or more parsing issues, see `problems()` for details”
Rows: 20036 Columns: 355
── Column specification ────────────────────────────────────────
Delimiter: ","
chr (353): What is your age (# years)?, What is your gender? - Selected Choi...
dbl   (1): Duration (in seconds)
lgl   (1): Which of the following business intelligence tools do you use on ...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(kaggle2020)
Duration (in seconds) | What is your age (# years)? | What is your gender? - Selected Choice | In which country do you currently reside? | What is the highest level of formal education that you have attained or plan to attain within the next 2 years? | Select the title most similar to your current role (or most recent title if retired): - Selected Choice | For how many years have you been writing code and/or programming? | What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - Python | What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - R | What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - SQL | ⋯ | In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Weights & Biases | In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Comet.ml | In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Sacred + Omniboard | In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - TensorBoard | In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Guild.ai | In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Polyaxon | In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? 
(Select all that apply) - Selected Choice - Trains | In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Domino Model Monitor | In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - None | In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Other |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
<dbl> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | ⋯ | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> |
1838 | 35-39 | Man | Colombia | Doctoral degree | Student | 5-10 years | Python | R | SQL | ⋯ | NA | NA | NA | TensorBoard | NA | NA | NA | NA | NA | NA |
289287 | 30-34 | Man | United States of America | Master’s degree | Data Engineer | 5-10 years | Python | R | SQL | ⋯ | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
860 | 35-39 | Man | Argentina | Bachelor’s degree | Software Engineer | 10-20 years | NA | NA | NA | ⋯ | NA | NA | NA | NA | NA | NA | NA | NA | None | NA |
507 | 30-34 | Man | United States of America | Master’s degree | Data Scientist | 5-10 years | Python | NA | SQL | ⋯ | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
78 | 30-34 | Man | Japan | Master’s degree | Software Engineer | 3-5 years | Python | NA | NA | ⋯ | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
401 | 30-34 | Man | India | Bachelor’s degree | Data Analyst | < 1 years | Python | R | NA | ⋯ | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
# Note that summary() as opposed to pandas' describe() gives summary for all variable types by default
summary(kaggle2020)
Duration (in seconds) What is your age (# years)? Min. : 20 Length:20036 1st Qu.: 398 Class :character Median : 626 Mode :character Mean : 9156 3rd Qu.: 1030 Max. :1144493 What is your gender? - Selected Choice Length:20036 Class :character Mode :character In which country do you currently reside? Length:20036 Class :character Mode :character What is the highest level of formal education that you have attained or plan to attain within the next 2 years? Length:20036 Class :character Mode :character Select the title most similar to your current role (or most recent title if retired): - Selected Choice Length:20036 Class :character Mode :character For how many years have you been writing code and/or programming? Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - Python Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - R Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - SQL Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - C Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - C++ Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - Java Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - Javascript Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? 
(Select all that apply) - Selected Choice - Julia Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - Swift Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - Bash Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - MATLAB Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character What programming languages do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character What programming language would you recommend an aspiring data scientist to learn first? - Selected Choice Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? (Select all that apply) - Selected Choice - Jupyter (JupyterLab, Jupyter Notebooks, etc) Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? (Select all that apply) - Selected Choice - RStudio Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? (Select all that apply) - Selected Choice - Visual Studio / Visual Studio Code Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? (Select all that apply) - Selected Choice - Click to write Choice 13 Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? 
(Select all that apply) - Selected Choice - PyCharm Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? (Select all that apply) - Selected Choice - Spyder Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? (Select all that apply) - Selected Choice - Notepad++ Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? (Select all that apply) - Selected Choice - Sublime Text Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? (Select all that apply) - Selected Choice - Vim / Emacs Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? (Select all that apply) - Selected Choice - MATLAB Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character Which of the following integrated development environments (IDE's) do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - Kaggle Notebooks Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - Colab Notebooks Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? 
(Select all that apply) - Selected Choice - Azure Notebooks Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - Paperspace / Gradient Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - Binder / JupyterHub Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - Code Ocean Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - IBM Watson Studio Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - Amazon Sagemaker Studio Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - Amazon EMR Notebooks Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - Google Cloud AI Platform Notebooks Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - Google Cloud Datalab Notebooks Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - Databricks Collaborative Notebooks Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? 
(Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character Which of the following hosted notebook products do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character What type of computing platform do you use most often for your data science projects? - Selected Choice Length:20036 Class :character Mode :character Which types of specialized hardware do you use on a regular basis? (Select all that apply) - Selected Choice - GPUs Length:20036 Class :character Mode :character Which types of specialized hardware do you use on a regular basis? (Select all that apply) - Selected Choice - TPUs Length:20036 Class :character Mode :character Which types of specialized hardware do you use on a regular basis? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character Which types of specialized hardware do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character Approximately how many times have you used a TPU (tensor processing unit)? Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? (Select all that apply) - Selected Choice - Matplotlib Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? (Select all that apply) - Selected Choice - Seaborn Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? (Select all that apply) - Selected Choice - Plotly / Plotly Express Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? (Select all that apply) - Selected Choice - Ggplot / ggplot2 Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? 
(Select all that apply) - Selected Choice - Shiny Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? (Select all that apply) - Selected Choice - D3 js Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? (Select all that apply) - Selected Choice - Altair Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? (Select all that apply) - Selected Choice - Bokeh Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? (Select all that apply) - Selected Choice - Geoplotlib Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? (Select all that apply) - Selected Choice - Leaflet / Folium Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character What data visualization libraries or tools do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character For how many years have you used machine learning methods? Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - Scikit-learn Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - TensorFlow Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? 
(Select all that apply) - Selected Choice - Keras Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - PyTorch Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - Fast.ai Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - MXNet Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - Xgboost Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - LightGBM Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - CatBoost Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - Prophet Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - H2O 3 Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - Caret Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - Tidymodels Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? 
(Select all that apply) - Selected Choice - JAX Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character Which of the following machine learning frameworks do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? (Select all that apply): - Selected Choice - Linear or Logistic Regression Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? (Select all that apply): - Selected Choice - Decision Trees or Random Forests Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? (Select all that apply): - Selected Choice - Gradient Boosting Machines (xgboost, lightgbm, etc) Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? (Select all that apply): - Selected Choice - Bayesian Approaches Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? (Select all that apply): - Selected Choice - Evolutionary Approaches Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? (Select all that apply): - Selected Choice - Dense Neural Networks (MLPs, etc) Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? (Select all that apply): - Selected Choice - Convolutional Neural Networks Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? 
(Select all that apply): - Selected Choice - Generative Adversarial Networks Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? (Select all that apply): - Selected Choice - Recurrent Neural Networks Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? (Select all that apply): - Selected Choice - Transformer Networks (BERT, gpt-3, etc) Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? (Select all that apply): - Selected Choice - None Length:20036 Class :character Mode :character Which of the following ML algorithms do you use on a regular basis? (Select all that apply): - Selected Choice - Other Length:20036 Class :character Mode :character Which categories of computer vision methods do you use on a regular basis? (Select all that apply) - Selected Choice - General purpose image/video tools (PIL, cv2, skimage, etc) Length:20036 Class :character Mode :character Which categories of computer vision methods do you use on a regular basis? (Select all that apply) - Selected Choice - Image segmentation methods (U-Net, Mask R-CNN, etc) Length:20036 Class :character Mode :character Which categories of computer vision methods do you use on a regular basis? (Select all that apply) - Selected Choice - Object detection methods (YOLOv3, RetinaNet, etc) Length:20036 Class :character Mode :character Which categories of computer vision methods do you use on a regular basis? (Select all that apply) - Selected Choice - Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc) Length:20036 Class :character Mode :character Which categories of computer vision methods do you use on a regular basis? 
(Select all that apply) - Selected Choice - Generative Networks (GAN, VAE, etc) Length:20036 Class :character Mode :character
Which categories of computer vision methods do you use on a regular basis? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character
Which categories of computer vision methods do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
Which of the following natural language processing (NLP) methods do you use on a regular basis? (Select all that apply) - Selected Choice - Word embeddings/vectors (GLoVe, fastText, word2vec) Length:20036 Class :character Mode :character
Which of the following natural language processing (NLP) methods do you use on a regular basis? (Select all that apply) - Selected Choice - Encoder-decorder models (seq2seq, vanilla transformers) Length:20036 Class :character Mode :character
Which of the following natural language processing (NLP) methods do you use on a regular basis? (Select all that apply) - Selected Choice - Contextualized embeddings (ELMo, CoVe) Length:20036 Class :character Mode :character
Which of the following natural language processing (NLP) methods do you use on a regular basis? (Select all that apply) - Selected Choice - Transformer language models (GPT-3, BERT, XLnet, etc) Length:20036 Class :character Mode :character
Which of the following natural language processing (NLP) methods do you use on a regular basis? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character
Which of the following natural language processing (NLP) methods do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
What is the size of the company where you are employed? Length:20036 Class :character Mode :character
Approximately how many individuals are responsible for data science workloads at your place of business? Length:20036 Class :character Mode :character
Does your current employer incorporate machine learning methods into their business? Length:20036 Class :character Mode :character
Select any activities that make up an important part of your role at work: (Select all that apply) - Selected Choice - Analyze and understand data to influence product or business decisions Length:20036 Class :character Mode :character
Select any activities that make up an important part of your role at work: (Select all that apply) - Selected Choice - Build and/or run the data infrastructure that my business uses for storing, analyzing, and operationalizing data Length:20036 Class :character Mode :character
Select any activities that make up an important part of your role at work: (Select all that apply) - Selected Choice - Build prototypes to explore applying machine learning to new areas Length:20036 Class :character Mode :character
Select any activities that make up an important part of your role at work: (Select all that apply) - Selected Choice - Build and/or run a machine learning service that operationally improves my product or workflows Length:20036 Class :character Mode :character
Select any activities that make up an important part of your role at work: (Select all that apply) - Selected Choice - Experimentation and iteration to improve existing ML models Length:20036 Class :character Mode :character
Select any activities that make up an important part of your role at work: (Select all that apply) - Selected Choice - Do research that advances the state of the art of machine learning Length:20036 Class :character Mode :character
Select any activities that make up an important part of your role at work: (Select all that apply) - Selected Choice - None of these activities are an important part of my role at work Length:20036 Class :character Mode :character
Select any activities that make up an important part of your role at work: (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
What is your current yearly compensation (approximate $USD)? Length:20036 Class :character Mode :character
Approximately how much money have you (or your team) spent on machine learning and/or cloud computing services at home (or at work) in the past 5 years (approximate $USD)? Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - Amazon Web Services (AWS) Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - Microsoft Azure Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - Google Cloud Platform (GCP) Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - IBM Cloud / Red Hat Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - Oracle Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - SAP Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - Salesforce Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - VMware Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - Alibaba Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - Tencent Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - Amazon EC2 Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - AWS Lambda Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - Amazon Elastic Container Service Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - Azure Cloud Services Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - Microsoft Azure Container Instances Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - Azure Functions Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - Google Cloud Compute Engine Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - Google Cloud Functions Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - Google Cloud Run Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - Google Cloud App Engine Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - No / None Length:20036 Class :character Mode :character
Do you use any of the following cloud computing products on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
Do you use any of the following machine learning products on a regular basis? (Select all that apply) - Selected Choice - Amazon SageMaker Length:20036 Class :character Mode :character
Do you use any of the following machine learning products on a regular basis? (Select all that apply) - Selected Choice - Amazon Forecast Length:20036 Class :character Mode :character
Do you use any of the following machine learning products on a regular basis? (Select all that apply) - Selected Choice - Amazon Rekognition Length:20036 Class :character Mode :character
Do you use any of the following machine learning products on a regular basis? (Select all that apply) - Selected Choice - Azure Machine Learning Studio Length:20036 Class :character Mode :character
Do you use any of the following machine learning products on a regular basis? (Select all that apply) - Selected Choice - Azure Cognitive Services Length:20036 Class :character Mode :character
Do you use any of the following machine learning products on a regular basis? (Select all that apply) - Selected Choice - Google Cloud AI Platform / Google Cloud ML Engine Length:20036 Class :character Mode :character
Do you use any of the following machine learning products on a regular basis? (Select all that apply) - Selected Choice - Google Cloud Video AI Length:20036 Class :character Mode :character
Do you use any of the following machine learning products on a regular basis? (Select all that apply) - Selected Choice - Google Cloud Natural Language Length:20036 Class :character Mode :character
Do you use any of the following machine learning products on a regular basis? (Select all that apply) - Selected Choice - Google Cloud Vision AI Length:20036 Class :character Mode :character
Do you use any of the following machine learning products on a regular basis? (Select all that apply) - Selected Choice - No / None Length:20036 Class :character Mode :character
Do you use any of the following machine learning products on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - MySQL Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - PostgresSQL Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - SQLite Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Oracle Database Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - MongoDB Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Snowflake Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - IBM Db2 Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Microsoft SQL Server Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Microsoft Access Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Microsoft Azure Data Lake Storage Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Amazon Redshift Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Amazon Athena Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Amazon DynamoDB Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Google Cloud BigQuery Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Google Cloud SQL Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Google Cloud Firestore Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
Which of the following big data products (relational database, data warehouse, data lake, or similar) do you use most often? - Selected Choice Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Amazon QuickSight Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Microsoft Power BI Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Google Data Studio Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Looker Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Tableau Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Salesforce Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Einstein Analytics Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Qlik Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Domo Mode:logical NA's:20036
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - TIBCO Spotfire Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Alteryx Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Sisense Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - SAP Analytics Cloud Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
Which of the following business intelligence tools do you use most often? - Selected Choice Length:20036 Class :character Mode :character
Do you use any automated machine learning tools (or partial AutoML tools) on a regular basis? (Select all that apply) - Selected Choice - Automated data augmentation (e.g. imgaug, albumentations) Length:20036 Class :character Mode :character
Do you use any automated machine learning tools (or partial AutoML tools) on a regular basis? (Select all that apply) - Selected Choice - Automated feature engineering/selection (e.g. tpot, boruta_py) Length:20036 Class :character Mode :character
Do you use any automated machine learning tools (or partial AutoML tools) on a regular basis? (Select all that apply) - Selected Choice - Automated model selection (e.g. auto-sklearn, xcessiv) Length:20036 Class :character Mode :character
Do you use any automated machine learning tools (or partial AutoML tools) on a regular basis? (Select all that apply) - Selected Choice - Automated model architecture searches (e.g. darts, enas) Length:20036 Class :character Mode :character
Do you use any automated machine learning tools (or partial AutoML tools) on a regular basis? (Select all that apply) - Selected Choice - Automated hyperparameter tuning (e.g. hyperopt, ray.tune, Vizier) Length:20036 Class :character Mode :character
Do you use any automated machine learning tools (or partial AutoML tools) on a regular basis? (Select all that apply) - Selected Choice - Automation of full ML pipelines (e.g. Google AutoML, H20 Driverless AI) Length:20036 Class :character Mode :character
Do you use any automated machine learning tools (or partial AutoML tools) on a regular basis? (Select all that apply) - Selected Choice - No / None Length:20036 Class :character Mode :character
Do you use any automated machine learning tools (or partial AutoML tools) on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - Google Cloud AutoML Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - H20 Driverless AI Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - Databricks AutoML Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - DataRobot AutoML Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - Tpot Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - Auto-Keras Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - Auto-Sklearn Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - Auto_ml Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - Xcessiv Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - MLbox Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - No / None Length:20036 Class :character Mode :character
Which of the following automated machine learning tools (or partial AutoML tools) do you use on a regular basis? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
Do you use any tools to help manage machine learning experiments? (Select all that apply) - Selected Choice - Neptune.ai Length:20036 Class :character Mode :character
Do you use any tools to help manage machine learning experiments? (Select all that apply) - Selected Choice - Weights & Biases Length:20036 Class :character Mode :character
Do you use any tools to help manage machine learning experiments? (Select all that apply) - Selected Choice - Comet.ml Length:20036 Class :character Mode :character
Do you use any tools to help manage machine learning experiments? (Select all that apply) - Selected Choice - Sacred + Omniboard Length:20036 Class :character Mode :character
Do you use any tools to help manage machine learning experiments? (Select all that apply) - Selected Choice - TensorBoard Length:20036 Class :character Mode :character
Do you use any tools to help manage machine learning experiments? (Select all that apply) - Selected Choice - Guild.ai Length:20036 Class :character Mode :character
Do you use any tools to help manage machine learning experiments? (Select all that apply) - Selected Choice - Polyaxon Length:20036 Class :character Mode :character
Do you use any tools to help manage machine learning experiments? (Select all that apply) - Selected Choice - Trains Length:20036 Class :character Mode :character
Do you use any tools to help manage machine learning experiments? (Select all that apply) - Selected Choice - Domino Model Monitor Length:20036 Class :character Mode :character
Do you use any tools to help manage machine learning experiments? (Select all that apply) - Selected Choice - No / None Length:20036 Class :character Mode :character
Do you use any tools to help manage machine learning experiments? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
Where do you publicly share or deploy your data analysis or machine learning applications? (Select all that apply) - Selected Choice - Plotly Dash Length:20036 Class :character Mode :character
Where do you publicly share or deploy your data analysis or machine learning applications? (Select all that apply) - Selected Choice - Streamlit Length:20036 Class :character Mode :character
Where do you publicly share or deploy your data analysis or machine learning applications? (Select all that apply) - Selected Choice - NBViewer Length:20036 Class :character Mode :character
Where do you publicly share or deploy your data analysis or machine learning applications? (Select all that apply) - Selected Choice - GitHub Length:20036 Class :character Mode :character
Where do you publicly share or deploy your data analysis or machine learning applications? (Select all that apply) - Selected Choice - Personal blog Length:20036 Class :character Mode :character
Where do you publicly share or deploy your data analysis or machine learning applications? (Select all that apply) - Selected Choice - Kaggle Length:20036 Class :character Mode :character
Where do you publicly share or deploy your data analysis or machine learning applications? (Select all that apply) - Selected Choice - Colab Length:20036 Class :character Mode :character
Where do you publicly share or deploy your data analysis or machine learning applications? (Select all that apply) - Selected Choice - Shiny Length:20036 Class :character Mode :character
Where do you publicly share or deploy your data analysis or machine learning applications? (Select all that apply) - Selected Choice - I do not share my work publicly Length:20036 Class :character Mode :character
Where do you publicly share or deploy your data analysis or machine learning applications? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - Coursera Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - edX Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - Kaggle Learn Courses Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - DataCamp Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - Fast.ai Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - Udacity Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - Udemy Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - LinkedIn Learning Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - Cloud-certification programs (direct from AWS, Azure, GCP, or similar) Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - University Courses (resulting in a university degree) Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character
On which platforms have you begun or completed data science courses? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
What is the primary tool that you use at work or school to analyze data? (Include text response) - Selected Choice Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - Twitter (data science influencers) Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - Email newsletters (Data Elixir, O'Reilly Data & AI, etc) Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - Reddit (r/machinelearning, etc) Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - Kaggle (notebooks, forums, etc) Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - Course Forums (forums.fast.ai, Coursera forums, etc) Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - YouTube (Kaggle YouTube, Cloud AI Adventures, etc) Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - Podcasts (Chai Time Data Science, O’Reilly Data Show, etc) Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - Blogs (Towards Data Science, Analytics Vidhya, etc) Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - Journal Publications (peer-reviewed journals, conference proceedings, etc) Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - Slack Communities (ods.ai, kagglenoobs, etc) Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character
Who/what are your favorite media sources that report on data science topics? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - Amazon Web Services (AWS) Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - Microsoft Azure Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - Google Cloud Platform (GCP) Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - IBM Cloud / Red Hat Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - Oracle Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - SAP Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - VMware Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - Salesforce Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - Alibaba Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - Tencent Cloud Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - None Length:20036 Class :character Mode :character
Which of the following cloud computing platforms do you hope to become more familiar with in the next 2 years? - Selected Choice - Other Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - Amazon EC2 Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - AWS Lambda Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - Amazon Elastic Container Service Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - Azure Cloud Services Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - Microsoft Azure Container Instances Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - Azure Functions Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - Google Cloud Compute Engine Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - Google Cloud Functions Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - Google Cloud Run Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - Google Cloud App Engine Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific cloud computing products? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific machine learning products? (Select all that apply) - Selected Choice - Amazon SageMaker Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific machine learning products? (Select all that apply) - Selected Choice - Amazon Forecast Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific machine learning products? (Select all that apply) - Selected Choice - Amazon Rekognition Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific machine learning products? (Select all that apply) - Selected Choice - Azure Machine Learning Studio Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific machine learning products? (Select all that apply) - Selected Choice - Azure Cognitive Services Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific machine learning products? (Select all that apply) - Selected Choice - Google Cloud AI Platform / Google Cloud ML Engine Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific machine learning products? (Select all that apply) - Selected Choice - Google Cloud Video AI Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific machine learning products? (Select all that apply) - Selected Choice - Google Cloud Natural Language Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific machine learning products? (Select all that apply) - Selected Choice - Google Cloud Vision AI Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific machine learning products? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character
In the next 2 years, do you hope to become more familiar with any of these specific machine learning products? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - MySQL Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - PostgresSQL Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - SQLite Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Oracle Database Length:20036 Class :character Mode :character
Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years?
(Select all that apply) - Selected Choice - MongoDB Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Snowflake Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - IBM Db2 Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Microsoft SQL Server Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Microsoft Access Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Microsoft Azure Data Lake Storage Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Amazon Redshift Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? 
(Select all that apply) - Selected Choice - Amazon Athena Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Amazon DynamoDB Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Google Cloud BigQuery Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Google Cloud SQL Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Google Cloud Firestore Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character Which of the following big data products (relational databases, data warehouses, data lakes, or similar) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? 
(Select all that apply) - Selected Choice - Microsoft Power BI Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Amazon QuickSight Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Google Data Studio Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Looker Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Tableau Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Salesforce Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Einstein Analytics Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Qlik Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Domo Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? 
(Select all that apply) - Selected Choice - TIBCO Spotfire Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Alteryx Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Sisense Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - SAP Analytics Cloud Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character Which of the following business intelligence tools do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character Which categories of automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Automated data augmentation (e.g. imgaug, albumentations) Length:20036 Class :character Mode :character Which categories of automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Automated feature engineering/selection (e.g. tpot, boruta_py) Length:20036 Class :character Mode :character Which categories of automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Automated model selection (e.g. 
auto-sklearn, xcessiv) Length:20036 Class :character Mode :character Which categories of automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Automated model architecture searches (e.g. darts, enas) Length:20036 Class :character Mode :character Which categories of automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Automated hyperparameter tuning (e.g. hyperopt, ray.tune, Vizier) Length:20036 Class :character Mode :character Which categories of automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Automation of full ML pipelines (e.g. Google Cloud AutoML, H20 Driverless AI) Length:20036 Class :character Mode :character Which categories of automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character Which categories of automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Google Cloud AutoML Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? 
(Select all that apply) - Selected Choice - H20 Driverless AI Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Databricks AutoML Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - DataRobot AutoML Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Tpot Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Auto-Keras Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Auto-Sklearn Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Auto_ml Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Xcessiv Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? 
(Select all that apply) - Selected Choice - MLbox Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character Which specific automated machine learning tools (or partial AutoML tools) do you hope to become more familiar with in the next 2 years? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Neptune.ai Length:20036 Class :character Mode :character In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Weights & Biases Length:20036 Class :character Mode :character In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Comet.ml Length:20036 Class :character Mode :character In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Sacred + Omniboard Length:20036 Class :character Mode :character In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - TensorBoard Length:20036 Class :character Mode :character In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Guild.ai Length:20036 Class :character Mode :character In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? 
(Select all that apply) - Selected Choice - Polyaxon Length:20036 Class :character Mode :character In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Trains Length:20036 Class :character Mode :character In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Domino Model Monitor Length:20036 Class :character Mode :character In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - None Length:20036 Class :character Mode :character In the next 2 years, do you hope to become more familiar with any of these tools for managing ML experiments? (Select all that apply) - Selected Choice - Other Length:20036 Class :character Mode :character
# table() is flexible: it can tabulate a single variable or cross-tabulate several
table(kaggle2020[3])
                    Man               Nonbinary       Prefer not to say 
                  15789                      52                     263 
Prefer to self-describe                   Woman 
                     54                    3878 
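The call above tabulates one variable only. As a minimal sketch of the crosstab case, the example below uses a small made-up data frame; the `toy` object, its column names, and its values are assumptions for illustration and are not part of the Kaggle data.

```r
# Hypothetical toy data (not from kaggle2020)
toy <- data.frame(
  gender  = c("Man", "Woman", "Man", "Woman", "Man", "Nonbinary"),
  student = c("Yes", "No", "No", "Yes", "Yes", "No")
)

# One variable: counts per category
one_way <- table(toy$gender)
one_way

# Two variables: crosstab with gender in rows and student status in columns
two_way <- table(toy$gender, toy$student)
two_way

# addmargins() appends row and column totals to the crosstab
addmargins(two_way)
```

Passing two vectors to table() returns a two-dimensional table, so individual cells can be indexed like a matrix, e.g. two_way["Man", "Yes"].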
# Wrapping it in prop.table() gives the proportion of each category
prop.table(table(kaggle2020[3]))
                    Man               Nonbinary       Prefer not to say 
            0.788031543             0.002595328             0.013126373 
Prefer to self-describe                   Woman 
            0.002695149             0.193551607 
# Wrapping it in sort() orders categories by count rather than alphabetically (or by factor levels); [1] then keeps the most frequent one
sort(table(kaggle2020[3]), decreasing = TRUE)[1]
  Man 
15789 
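The three wrappers shown above (table(), prop.table(), sort()) can be combined. The sketch below does so on a made-up vector; `answers` and its values are assumptions for illustration only.

```r
# Hypothetical toy vector (not from kaggle2020)
answers <- c("Man", "Woman", "Man", "Man", "Nonbinary", "Woman")

# Counts sorted from most to least frequent
counts <- sort(table(answers), decreasing = TRUE)
counts

# Keep only the two most frequent categories
top2 <- head(counts, 2)
top2

# Percentage shares, rounded to one decimal place
shares <- round(100 * prop.table(counts), 1)
shares

# Name of the most frequent category
names(counts)[1]
```

Sorting before taking head() or [1] is what makes the result the most frequent categories rather than the first ones in alphabetical order.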