Data Science User Group (Virtual Meeting): Anomaly Detection with H2O.ai

02:39
Hi folks! We’ll get started here in about 8 minutes

02:51
Thanks for joining early 🙂

16:18
You can see the live meeting notes here: https://otter.ai/u/deCTpIB5JJaLu8JAG4bIqhHqcGQ?utm_source=va_chat_link Scroll back to review anything you missed.

21:36
Here are the direct links to the resources and events mentioned!

21:38
Join the Data Science Chapter: https://usergroups.snowflake.com/data-science/
Join the Data Science Discussion: https://community.snowflake.com/s/group/0F93r000000XcZ8CAK/data-science
Find Your Local User Group Chapter: https://usergroups.snowflake.com/chapters/
BLOG | Bringing Enterprise-Grade Python Innovation to the Data Cloud: https://www.snowflake.com/blog/snowpark-python-innovation-available-all-snowflake-customers/
WEBINAR | Building Scalable Feature Engineering Pipelines: https://www.snowflake.com/webinar/thought-leadership/building-scalable-feature-engineering-pipelines-with-snowpark-python-and-scikit-learn/
BUILD | The Data Cloud Dev Summit: https://www.snowflake.com/build/
EVENT | Data Cloud World Tour: https://www.snowflake.com/data-cloud-world-tour/

22:47
How do we access the recording of this session in the future?

23:28
Hi Erika! The recording will be posted to the event page as well as the discussion group.

23:41
Thank you

26:19
Does anyone have any questions or action items? Let's capture them in the meeting notes: https://otter.ai/u/deCTpIB5JJaLu8JAG4bIqhHqcGQ?utm_source=va_chat_link

29:35
I have a question: how would you describe integrating statistical analysis, such as hypothesis testing, into machine learning model development? I currently use SAS Studio.

29:45
Can we use anomaly detection to find data quality issues?

30:10
I'm working with a sequence of pressure data in a system that can't really be labeled -- trying to use dimensionality reduction then clustering to find anomalies

32:14
Great questions, we'll get to these in the next pause!

32:46
Thanks for sharing your use case, Brent! Would love to hear more.

32:56
How do we find anomalies in grouped data? For example, say we have customer age groups and their sales, and we want to find anomalies within each group.
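
One way to approach the grouped-data question above, offered as a minimal sketch and not as anything shown in the session: score each value only against the other values in its own group. The function name, the sample sales figures, and the 3.5 cutoff on the modified z-score (median and MAD based) are all illustrative assumptions, and none of this is H2O functionality.

```python
from collections import defaultdict
from statistics import median


def flag_group_anomalies(records, threshold=3.5):
    """Return (group, value, score) for values unusual within their own group.

    Uses the modified z-score, 0.6745 * (x - median) / MAD, which is less
    distorted by the outliers themselves than a mean/stdev z-score.
    """
    # Collect the values belonging to each group (hypothetical age bands here).
    by_group = defaultdict(list)
    for group, value in records:
        by_group[group].append(value)

    flagged = []
    for group, values in by_group.items():
        med = median(values)
        mad = median(abs(v - med) for v in values)
        if mad == 0:
            continue  # no spread in this group, so there is nothing to scale by
        for v in values:
            score = 0.6745 * (v - med) / mad
            if abs(score) > threshold:
                flagged.append((group, v, round(score, 2)))
    return flagged


# Made-up sales per age group: 400 is ordinary for "26-40" but not for "18-25".
sales = [
    ("18-25", 40), ("18-25", 42), ("18-25", 38), ("18-25", 41), ("18-25", 400),
    ("26-40", 390), ("26-40", 410), ("26-40", 400), ("26-40", 395), ("26-40", 405),
]
anomalies = flag_group_anomalies(sales)
print(anomalies)
```

Scoring within groups means the same sales figure can be normal in one age band and flagged in another, which a single global threshold would miss.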

36:26
Does the data have to be loaded to the H2O cloud? Can the whole process be done on-premises or on Snowflake?

41:53
So H2O is basically like a Python library, and we just use the models in H2O and do the regular Python coding to train, test, deploy, and everything

42:26
This looks like time series data. Why choose not to include the date field in the model?

43:12
Can I use H2O as a repository and source of truth for my data and models?

43:38
Is there an approach to detecting anomalies in semi-structured and unstructured data?

44:26
Got it!

44:57
Thanks. I know this is a hard question.

45:41
Kurt, dimensionality reduction techniques to reduce the search space can definitely help -- it depends on how sparse the data set is

46:47
Akshay and Alejandro, we will get to your questions about H2O momentarily!

48:29
How would you suggest picking the hyperparameters for Random Forest?

51:04
Do you have any features in H2O to train these models?

52:05
Are there any data wrangling features in this AI ecosystem?

57:37
Is this a paid service?

57:46
Do you have a connector for BI tools?

57:47
The Driverless AI?

57:51
Like MicroStrategy

59:28
are you going to share the slice please?

59:39
Do you have features that lead from anomaly detection to active learning/data labeling as far as model retraining goes?

59:53
Slides

01:00:45
Can you share the GitHub project you referenced, please? Thank you.

01:01:48
Does H2O have a connector for BI tools?

01:02:12
Is Driverless AI like Dataiku's drag-and-drop method? No coding involved, like the demo showed earlier?

01:02:46
Where is the feature store work done?

01:02:51
thanks for answering my question

01:08:14
Very insightful presentation!

01:08:18
Awesome thanks

01:09:01
https://github.com/h2oai/h2o-tutorials/tree/master/best-practices

01:09:17
Thank you!

01:09:30
Thank you!!

01:10:17
Very good presentation

01:10:42
Thank you!!!

01:10:42
Thank you, Megan!

01:10:44
Thank you for the great presentation!

01:10:54
Thank you so much!!!

01:11:00
Thank you very much

01:11:01
Thank you

01:11:27
Thank you Megan, Elsa.

01:11:30
thank you

01:11:35
Thank you!