Our Conference

Data Science UA will gather participants from all over the world at the 9th Data Science UA Conference, which will be held online on November 20th, 2020.

The conference will run non-stop for 24 hours across three tracks: Technical, Workshops, and Business.

Speakers from top companies such as Amazon, Facebook AI, Airbus, Nvidia, Google, and IBM will share their experience and discuss how AI is transforming the world today and what is coming tomorrow.

There will be 3000+ participants and 70 speakers from the world’s best companies.

At the conference, you will work through algorithms step by step in practical workshops and gain insights that you can apply in your own work.

5% of ticket revenue will be donated to the charity campaign of the Group of Active Rehabilitation.

Let’s build the Data Science community together!

Talk language: English

Technical track

  • How to organize the workflow of Data Science projects? 
  • Diving deep into the raw data.
  • How to identify critical features with data analysis?
  • Working with data analysis models: how to optimize and validate solutions?
  • How to create data solutions and integrate them into existing products?
  • Problems of education and raising expertise.

Business track

  • How to pave the path from problem to solution in the product? 
  • How to solve issues with the implementation and feasibility of Data Science in various fields?
  • What difficulties can arise when forming a team, and how do you tackle them?
  • How to estimate costs and evaluate projects?
  • Understanding the models and algorithms: how to work with them and what to expect?

Workshops track

  • Discover real practical insights in a short time. 
  • Get guidance from leading engineers.
  • Have hands-on experience with Data Science tasks and tools.

November 20th, 2020

Start: 8am GMT

Julien Simon

Principal Developer Advocate, AI & Machine Learning at Amazon

End-to-end natural language processing with Amazon SageMaker

Technical track

Bio

As a Global AI & Machine Learning Evangelist, Julien focuses on helping developers and enterprises bring their ideas to life. He frequently speaks at conferences, and also blogs on the AWS Blog and on Medium.

Abstract

In this code-level talk, we will start from a large natural language dataset. Using Python and Jupyter, we will first run a batch job on SageMaker Processing in order to clean, stem, and tokenize the dataset. Then, we'll use fully managed infrastructure to train topic modeling models with the Latent Dirichlet Allocation and Neural Topic Modeling algorithms, two built-in algorithms in SageMaker. Finally, we'll deploy both models, we'll run predictions and we'll compare results.
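The clean, stem, and tokenize step mentioned in the abstract can be illustrated with a minimal plain-Python sketch. This is not the speaker's code and does not use SageMaker Processing; the stop-word list, the suffix list, and the function names `crude_stem` and `preprocess` are all hypothetical, and the suffix stripping is a toy stand-in for a real stemmer.

```python
import re

# Hypothetical, deliberately tiny lists used only for illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}
SUFFIXES = ("ing", "ed", "s")


def crude_stem(word):
    """Strip one common suffix; a toy stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        # Keep at least three characters so short words stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("Training the models: trained quickly!"))
```

In a batch setting, a function like `preprocess` would be applied to each document before the token lists are handed to a topic model.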

Shagun Sodhani

Research Engineer at Facebook AI

A tutorial on Policy Gradients

Technical track

Bio

Hi! I am Shagun, a Research Engineer with Facebook AI Research. Before that, I was an MSc student at Mila (Quebec Artificial Intelligence Institute) with Prof Yoshua Bengio and Prof Jian Tang. My research focuses on lifelong reinforcement learning: training AI systems that can interact with and learn from the physical world (reinforcement learning) and consistently improve as they do so without forgetting previous knowledge (lifelong learning). My stack primarily comprises Python (and related ML/DS/visualization toolkits). I love to play with new technology and look forward to meeting new people at Data Science UA. Website: https://shagunsodhani.com Previous Talks: https://shagunsodhani.com/talks/

Abstract

Policy gradient algorithms are a popular and widely applicable family of reinforcement learning algorithms. Several state-of-the-art RL algorithms (PPO, SAC, IMPALA, etc.) are variants of policy gradient algorithms. The main idea behind them is to learn a parametric policy by directly optimizing the policy (instead of optimizing a value function, as value-function-based methods do). This characteristic makes them a natural fit for tasks where the learning agent can choose an action from a continuous range (e.g., controlling the angle when steering a car). However, they are also useful for tasks with a discrete action space (like choosing between accelerator, brake, and clutch). In this talk, we will start with the vanilla policy gradient algorithm. While extremely easy to implement, the basic algorithm suffers from high variance in practice (as we will see during the talk). Then we will talk about some "cheap" yet effective methods for reducing that variance. From there on, we will discuss one of the more commonly used algorithms, Soft Actor-Critic (SAC), and walk through a simple SAC implementation. We will conclude the talk with a discussion of IMPALA, a distributed policy gradient algorithm that can be used to scale RL agents' training for real-life tasks while using less data.
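As a rough illustration of the vanilla policy gradient and one cheap variance-reduction trick (subtracting a running-average baseline from the reward), here is a minimal sketch on a two-armed bandit with a discrete action space. It is not taken from the talk: the environment, the reward noise, the learning rate, and the function name `train_bandit` are assumptions chosen for the example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(steps=2000, lr=0.1, use_baseline=True, seed=0):
    """Vanilla policy gradient on a two-armed bandit.

    Returns the final probability the policy assigns to arm 1.
    """
    rng = random.Random(seed)
    mean_reward = [0.0, 1.0]  # hypothetical environment: arm 1 is better
    theta = [0.0, 0.0]        # policy parameters: one logit per action
    baseline = 0.0            # running average of observed rewards
    for _ in range(steps):
        probs = softmax(theta)
        action = 0 if rng.random() < probs[0] else 1
        reward = rng.gauss(mean_reward[action], 0.1)
        # Subtracting a baseline keeps the gradient estimate unbiased
        # while typically shrinking its variance.
        advantage = reward - baseline if use_baseline else reward
        # d/d theta_k of log pi(action) for a softmax policy
        # is 1[k == action] - pi_k.
        for k in range(2):
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * advantage * grad_log_pi
        baseline += 0.05 * (reward - baseline)
    return softmax(theta)[1]


if __name__ == "__main__":
    print("P(arm 1) with baseline:", train_bandit(use_baseline=True))
    print("P(arm 1) without baseline:", train_bandit(use_baseline=False))
```

The same update carries over to continuous actions by replacing the softmax with a parametric density such as a Gaussian; only the expression for the gradient of the log-probability changes.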

Siddha Ganju

Solutions Architect at NVIDIA

30 Golden Rules of Deep Learning Performance

Technical track

Bio

Siddha Ganju, an AI researcher who Forbes featured in their 30 under 30 list, is a Self-Driving Architect at Nvidia. As an AI Advisor to NASA FDL, she helped build an automated meteor detection pipeline for the CAMS project at NASA, which ended up discovering a comet.

Abstract

“Watching paint dry is faster than training my deep learning model.” “If only I had ten more GPUs, I could train my model in time.” “I want to run my model on a cheap smartphone, but it’s probably too heavy and slow.” If this sounds like you, then you might like this talk. Exploring the landscape of training and inference, we cover a myriad of tricks that step-by-step improve the efficiency of most deep learning pipelines, reduce wasted hardware cycles, and make them cost-effective. We identify and fix inefficiencies across different parts of the pipeline, including data preparation, reading and augmentation, training, and inference. With a data-driven approach and easy-to-replicate TensorFlow examples, finely tune the knobs of your deep learning pipeline to get the best out of your hardware. And with the money you save, demand a raise! Domain level: Beginner.

Dr. Sergei Bobrovskyi

Data Scientist at Airbus

Deep Learning Anomaly Detection

Technical track

Bio

Dr. Sergei Bobrovskyi is a Data Scientist within the AI Platforms team at Airbus. His work focuses on applications of AI for anomaly detection in time series, spanning various use-cases across Airbus.

Abstract

Many modern products, as well as manufacturing systems, produce large amounts of sensor signals, which humans cannot analyze or even capture in their totality. In this talk, we focus on automatic anomaly detection tasks for sensor data. We assess the industrial viability of various semi-supervised anomaly detection systems based on Deep Learning for automatic discovery of point, contextual, and collective anomalies on large datasets with little prior knowledge.
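To make the semi-supervised setting concrete, the sketch below fits a profile on readings assumed to be normal and then flags point anomalies in new data. It deliberately swaps the deep learning models discussed in the talk for a simple mean and standard deviation threshold, so it illustrates only the workflow, not the speaker's method; the function names, the threshold of 3 standard deviations, and the sample readings are hypothetical.

```python
from statistics import mean, stdev


def fit_normal_profile(normal_readings):
    """Summarize known-normal sensor readings as (mean, standard deviation)."""
    return mean(normal_readings), stdev(normal_readings)


def find_point_anomalies(readings, profile, threshold=3.0):
    """Return indices of readings further than threshold * sigma from the mean."""
    mu, sigma = profile
    return [i for i, x in enumerate(readings) if abs(x - mu) > threshold * sigma]


if __name__ == "__main__":
    # Hypothetical temperature readings recorded during normal operation.
    normal = [20.0, 20.5, 19.5, 20.2, 19.8, 20.1, 19.9, 20.0]
    profile = fit_normal_profile(normal)
    # New readings; the third value is an obvious spike.
    print(find_point_anomalies([20.1, 19.7, 35.0, 20.3], profile))
```

A global threshold like this can only catch point anomalies. Contextual and collective anomalies depend on the surrounding window of the time series, which is where sequence models become relevant.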

Oleksandr Maksymets

Research Engineer at Facebook AI

Embodied AI: Agents that can See, Talk, Act and Reason

Technical track

Bio

Oleksandr Maksymets is a research engineer at Facebook AI Research (FAIR) working on embodied agent navigation using deep learning. He is a co-author and maintainer of FAIR's open-source Habitat AI framework, which brings community benchmarks to the field and supports simulation-to-reality transfer. Oleksandr was one of the organizers of the Embodied AI challenges and workshops at CVPR 2019/2020.

Abstract

Imagine walking up to a home robot and asking “Hey robot – can you go check if my laptop is on my desk? And if so, bring it to me.” AI Habitat enables training of such embodied AI agents (virtual robots and egocentric assistants) in a highly photorealistic & efficient 3D simulator, before transferring the learned skills to reality. We will talk about the state of the art in training intelligent agents with machine learning and how to scale a model to 30 years of house-walking experience.

Dipanjan (DJ) Sarkar

Data Science Lead, Google Developer Expert at Google

Deep Transfer Learning for Natural Language Processing

Technical track

Bio

Dipanjan (DJ) Sarkar is a Data Science Lead at Applied Materials, leading advanced analytics efforts around computer vision, natural language processing, and deep learning. He is also a Google Developer Expert in Machine Learning. He has consulted and worked with several startups as well as Fortune 500 companies like Intel and open-source organizations like Red Hat / IBM.

Abstract

Handling challenging real-world problems in Natural Language Processing (NLP) includes tackling class imbalance, problem complexity, and the lack of enough labeled data for training. Thanks to recent advancements in deep transfer learning for NLP, we have been able to make rapid strides not only in tackling these problems but also in leveraging these models for diverse downstream NLP tasks. The intent of this session is to journey through these advancements by looking at various state-of-the-art models and methodologies, including:

  • Pre-trained embeddings for deep learning models (FastText with CNNs / bidirectional LSTMs + Attention)
  • Universal embeddings (Sentence Encoders, NNLMs)
  • Transformers

We will also look at the power of some of these models, especially transformers, to solve diverse problems like summarization, entity recognition, question answering, sentiment analysis, and classification, with some hands-on examples leveraging Python, TensorFlow, and the famous transformers library from HuggingFace.

Rich Dutton

Head of Machine Learning for Corporate Engineering at Google

How Google Uses AI and ML in the Enterprise

Business track

Bio

Rich Dutton is the Head of Machine Learning for Corporate Engineering at Google, where he leads a team of 15 engineers and data scientists across NYC and Austin. Prior to this role, Rich was a tech lead in Bigtable at Google, following a 15-year career working in data and analytics across both tech and finance in the US (New York and Seattle), Europe, and Asia. When not working, Rich practices Muay Thai and spends time in Williamsburg, Brooklyn, with his family, including his Mini Australian Shepherd, Radia, and his self-driving car, Trinity.

Abstract

This session will outline how Google’s Corporate Engineering team is using AI and machine learning to spur innovation within Google. Additionally, Rich will identify the work that his team does (the structure, example use cases etc.), and the research that’s driving the work his team does and the democratization of AI (work in ML Fairness, Privacy, Interpretability and AutoML technologies).

Sandeep Jain

Leader - Data Science at IBM

Impediment to Predictive Maintenance

Business track

Bio

Dr. Sandeep Jain is a General Manager, Advanced Analytics and Optimization at IBM. His experience spans supply chain planning and optimization and predictive analytics for aerospace, oil & gas, heavy equipment, energy & utility, and CPG. He holds a Ph.D. in OR/Management Science from the Indian Institute of Science. Sandeep has published papers in various journals and conferences.

Abstract

Predictive maintenance is generally expected to reduce equipment downtime, improve productivity, and raise utilization, which in turn lowers cost and increases revenue. In practice, data scientists and maintenance and operations teams face challenges due to the quality and volume of data from IoT devices and storage. Sometimes failure data is so sparse that it is difficult even to train the models. There are methods that can be used to minimize the risk of a predictive maintenance project failing at implementation.

Meltem Ballan

Professional Data Science Fellow at General Motors

Front Seat of Autonomous Vehicles: Ethics, Acceptable Use and Computer Vision

Business track

Bio

Accomplished technology executive with a unique combination of analytical and leadership expertise developed over 20 years both in industry and academia. A pioneering woman data scientist who has nurtured and mentored hundreds of budding analysts and scientists as a recognized leader and as an advisory board member. Co-founded a technology startup providing a Big-data analytics and ML platform.

Abstract

Day by day we are getting closer to the autonomous vehicle era, and we still have a lot of work to do. Open questions include how and where to use the data, how much control we should give to AI, what use cases will serve humanity, and how we are solving biologically motivated computer vision. What information can be translated from the human brain and vision to the digital world?

Yehor Morylov

Computer Vision Tech Lead at EverguardAI

Computer Vision for Real-time Anomaly Detection in Steel Manufacturing

Technical track

Bio

Yehor Morylov is a Computer Vision Tech Lead at Everguard, a company that improves worker safety and prevents accidents before they happen. He has more than 5 years of experience in Computer Vision and Deep Learning and holds a master's degree in Computer Science and Artificial Intelligence from NTUU "KPI" IASA.

Abstract

Steel manufacturers suffer significant losses due to accidents on the production line. In this presentation, I will describe how we developed an anomaly detection system that monitors hot metal bars and prevents them from colliding. I will cover the approaches we used, from classical computer vision to segmentation networks, data synthesis, and anomaly detection.

Galina Voloshyna

Data & Analytics IT Director at Coca-Cola

Working with POS data in store & the insights to be gained in promotional optimization

Business track

Bio

Galina is a passionate IT leader with 16 years of experience in creating Business Intelligence and Data Analytics solutions to grow FMCG brands in P&G and The Coca-Cola Company. She is a hands-on IT professional who has been managing heavy lifting of digital transformation across a swathe of markets from Europe to China, delivering tangible business results.

Abstract

A significant portion of the budget of many consumer goods companies and retailers goes to promoting products to consumers. Some estimates put this number as high as 3% of overall revenue. Therefore, understanding which promotions have a positive ROI and what their exact effect is becomes a primary concern for many companies. The situation gets even more complicated when we take eCom retailers into account, where the decision to start or stop a price promotion needs to be made in a split second. In this talk we will cover specific questions and sets of algorithms to unravel promotional effectiveness analysis, specifically:

  • Deriving calendar or price promotions for organizations that do not track such information
  • Understanding volume flows from one product to another using only point-of-sale information
  • Predicting the performance of each promotion and therefore optimizing the overall company investment

We will discuss how a combination of trusted algorithms such as clustering, GBDT, and time series can help bring tangible benefits to the old problem of consumer promotions.

Saeed Reza Kheradpisheh

Data Science and Deep Learning Lecturer at Shahid Beheshti University

Spiking Neural Networks

Technical track

Bio

Saeed is a computational neuroscientist with a Ph.D. in computer science from the University of Tehran. His research mainly focuses on spiking neural networks and computational models of object recognition in the visual cortex.

Abstract

Saeed will tell us about spiking neural nets (SNNs): how they differ from traditional mainstream artificial neural nets and what advantages they offer. He will take us on a tour of neural coding, neuronal dynamics, neural connectivity, and learning algorithms in SNNs, and he will show us some examples of SNNs in visual categorization tasks.

Oleksandr Proskurin

Founder at Machine Factor Technologies

Improving time-series ensemble predictions with Sequential Bootstrapping. E-Mini S&P 500 futures example

Technical track

Bio

Oleksandr Proskurin is the Founder and CIO of Machine Factor Technologies, a company that advises asset managers on financial machine learning applications and algorithmic trading. His previous experience includes more than 4 years working in the hedge fund industry, researching and implementing volatility and commodity trading strategies using futures, options, and leveraged ETFs.

Abstract

In the lecture we will discuss:

1) The details, motivation, and implementation of the Sequential Bootstrap (SB) algorithm, and how to apply SB using the mlfinlab package, with an example comparing a standard ensemble algorithm against Sequential Bootstrap on E-Mini S&P 500 futures.
2) The computation time of the SB algorithm, a problem practitioners face. We will discuss how to detect in advance whether SB will boost out-of-sample performance compared to standard bagging by analyzing the structure of sample autocorrelation and the uniqueness histogram.
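The idea behind Sequential Bootstrap can be sketched in a few lines of plain Python. This is a from-scratch illustration under stated assumptions, not the mlfinlab API and not the speaker's implementation: each sample is assumed to be described by the inclusive range of bars its label spans, and the names `avg_uniqueness` and `sequential_bootstrap` are hypothetical. At every draw, a candidate's probability is proportional to its average uniqueness given the samples already drawn, so heavily overlapping samples become less likely to be picked again.

```python
import random


def avg_uniqueness(candidate, drawn, spans):
    """Average uniqueness of `candidate` if it were added to `drawn`.

    spans[i] = (first_bar, last_bar), inclusive, for sample i.
    """
    start, end = spans[candidate]
    total = 0.0
    for t in range(start, end + 1):
        # Concurrency at bar t: the candidate itself plus every
        # already-drawn sample (counting repeats) that covers t.
        concurrency = 1 + sum(1 for j in drawn if spans[j][0] <= t <= spans[j][1])
        total += 1.0 / concurrency
    return total / (end - start + 1)


def sequential_bootstrap(spans, n_draws=None, rng=None):
    """Draw sample indices with probability proportional to average uniqueness."""
    rng = rng or random.Random()
    n_draws = len(spans) if n_draws is None else n_draws
    indices = list(range(len(spans)))
    drawn = []
    while len(drawn) < n_draws:
        weights = [avg_uniqueness(i, drawn, spans) for i in indices]
        drawn.append(rng.choices(indices, weights=weights, k=1)[0])
    return drawn


if __name__ == "__main__":
    # Three samples: bars 0-2, bars 2-3 (overlaps the first), bars 4-5.
    spans = [(0, 2), (2, 3), (4, 5)]
    print(sequential_bootstrap(spans, rng=random.Random(0)))
```

Each draw recomputes the weights for every candidate, so the cost grows with both the number of draws and the number of samples, which is the computation-time concern raised in the second point above.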

Alexandr Honchar

Entrepreneur, Advisor, and Author in AI at Neurons Lab

The economy of AI

Business track

Bio

Alex Honchar has worked on industrial and research AI projects for around 7 years. At the moment he is active as an entrepreneur, applied researcher, and educator. He is a co-founder of the consulting boutique Neurons Lab, publishes blog posts on Medium with more than 1M views and academic articles with more than 100 citations, and regularly speaks at conferences and workshops across Europe.

Abstract

As entrepreneurs, we are interested in inventions and innovations that are directly profitable. Of course, at the peak of the hype you can capitalize even on the mention of “AI” in the product, but you are looking for innovations that have deep long-term value. Hence, we need a framework to translate all these “accuracies of these neural networks” into actual business models with a clear breakdown of costs and opportunities. In this talk, we will start with the main innovation patterns of industrial revolutions and how they affect global and micro-economies in terms of productivity, quality, speed, scaling, and spread. Then, we will extend this with cases specific to AI technologies, since AI automates not manual but cognitive human abilities. Within this framework, we will review celebrated AI use cases in retail (price and demand forecasting, real-time engagement), investment management (portfolio management and risk management), and manufacturing (predictive maintenance, quality control). Based on this, we will learn how exactly AI can improve processes, or do the opposite if the economics are miscalculated or the technology is misunderstood.

Alexandr Arapov

Co-founder, CPO at 3DLOOK

How data and modeling the human body measurements are shifting the old-fashion apparel industry to on-demand approach

Technical track

Bio

Alexandr Arapov started his entrepreneurial activity at the age of 20, and by 26 he was already managing development at a Fortune 500 e-commerce company. He is a certified product owner and an experienced engineering leader with over eight years directing consumer- and mobile-facing product and project management teams. In 2016, he co-founded 3DLOOK and created a 30-person R&D team.

Abstract

1) How analyzing current size charts together with modeled human body data allows brands to understand the problem with size charts
2) How a contactless measuring process plus a feedback loop helps to train size recommendation algorithms
3) How data modeling of different body shapes helps customers unlock new business opportunities

Enrico Santus

Senior Data Scientist at Bayer

NLP in Healthcare: Challenges and Opportunities

Business track

Bio

Enrico Santus is a senior data scientist at Bayer. His academic career includes a postdoc at MIT, in the group of Regina Barzilay, and numerous years spent between Asian (Hong Kong and Singapore) and European (Italy, UK and Germany) universities, working on topics such as NLP in Oncology, Cardiology and Palliative Care. Enrico has also worked on Fake News Detection, Sentiment Analysis and Lexical Semantics. He has published numerous papers in top tier conferences and journals, and several of his works were featured in mass media. He has been invited to talk at the White House and he is the first author of a fact sheet about AI for the American Congress.

Abstract

For many years we have been praising Artificial Intelligence (AI) and all its possible applications. The recent pandemic has shown, however, that despite the incredible advancements achieved in the last decade, we are still unable to fully exploit the potential of this technology to our advantage. Starting from this consideration, in this talk I will describe recent research on how Natural Language Processing, the linguistic branch of AI, can be used to improve our healthcare system, increasing its efficiency and efficacy.

Jay Kachhadia

Data Scientist at ViacomCBS

Full Stack Data Science: The Next Gen of Data Scientists Cohort

Business track

Bio

I play with petabytes of data and engineer systems that could see the future with machine learning and help make business decisions for brands like CBS, Comedy Central, Nickelodeon, MTV, Paramount Pictures, and many more. In short, I'm a Data Scientist at ViacomCBS Digital. I am also a Data Science blogger for Towards Data Science with more than 75,000 views on my own blogs. I did my master's in Data Science at Syracuse University and hold a bachelor's in Computer Engineering from the National Institute of Technology, Surat. During my undergrad, I was Lead for Google Developers Group NIT Surat, where I delivered talks on chatbot architecture and Data Science in Action.

Abstract

Data Science is a fast-changing industry, and because of its changing demands there is no longer a single specialization that gets you into the field. Different companies require different skill sets, and I would like to share the know-how of getting into Data Science straight out of school in 2020. I wrote a blog post on the same topic, which received more than 50,000 views and more than 100 shares on Twitter, including mentions from remarkable AI communities and Data Science companies worldwide. Many companies now need full-stack Data Scientists to build the infrastructure for practicing Data Science or to make data products powered by machine learning. In my current role, I work as a full-stack Data Scientist, and I would like to share what it takes for aspirants to break into this field and how full-stack Data Science is done.

Veronica Tamayo Flores

Head of Consulting at Data Science UA

TBA Soon

Business track

Bio

In 2018, she graduated from IE Business School (Spain) with a specialization in Business Analytics and Big Data. In the past, she worked in marketing and digital analytics for retail. Veronica manages data science and business intelligence technology projects at companies. Her main expertise is business analysis, business translation (a combination of business and technical skills), conducting analytical projects, and business development.

Olexiy Oryeshko

Staff Software Engineer at Google Search

Panel discussion

Bio

Olexiy applies Machine Learning to large-scale user-facing products. Olexiy has improved Machine Learning models and systems used in Web search, YouTube, Play Store, and other Google products. Now, Olexiy applies his experience as a tech lead for an interactive platform for data science and machine learning, used by hundreds of Google engineers. Olexiy earned his MS degree in computer science from Kyiv University in 2004.

Stevan Rudinac

Associate Professor Artificial Intelligence/Machine Learning for Business at University of Amsterdam

How multimedia analytics can help solve complex business problems

Business track

Bio

Stevan Rudinac is an Associate Professor of Artificial Intelligence for Business at the University of Amsterdam Business School and a guest researcher at the Informatics Institute of the UvA. In his research he aims at enabling large-scale multimedia analytics based on the relevance criteria defined at a higher semantic level, by jointly analysing visual content and the heterogeneous information associated with it, ranging from text, automatically generated metadata and open data statistics, to information about users and their social network. What fascinates him is the potential of artificial intelligence in addressing important societal challenges, such as liveability and security.

Abstract

In recent decades the production of multimedia content has exceeded all expectations. Both the size and heterogeneity of multimedia collections have increased significantly, and datasets featuring hundreds of millions of images, videos, text, information about users, and various metadata are becoming commonplace. Therefore, it is unsurprising that experts from practically all spheres of academia and industry are increasingly using this wealth of information to improve their processes and make better-informed decisions. In this talk we will showcase our recent work on applying multimedia analytics to solve problems in the domains of urban computing, marketing, and creative industries.

Himanshu Upreti

Co-Founder & Chief Technology Officer (CTO) at AI Palette

How to use AI in Consumer Food Product Innovation?

Business track

Bio

Himanshu is a highly driven and passionate entrepreneur currently leading the technology vision at Ai Palette as Co-Founder & CTO. Ai Palette is a deeptech AI startup backed by the Singapore Govt. that is revolutionizing the way new consumer products are created today and aiming to put a dent in the huge $4 Trillion Food Industry. Prior to this, Himanshu worked at Visa Inc. right after his graduation from IIT Guwahati and built data products on Visa’s Big Data Platform that enabled a seamless and faster payment experience. Himanshu has spoken previously about data science at Company and College Events, Podcasts, and General Assembly Data Science Course.

Abstract

90% of the new product launches in the CPG (Consumer Packaged Goods) industry fail in the first year. According to an AC Nielsen study, 50% of the products fail because they don’t address broader consumer needs. This is surprising given the amount of money and time that CPG companies spend on consumer research. But on digging deeper, one realizes the challenges with the current consumer research process. That is the exact problem Ai Palette is solving for CPG brands: helping them identify which product to launch next. At Ai Palette, we have built a cloud-based Artificial Intelligence platform that CPG companies can use to create consumer-winning products. The platform gathers insights from the consumer digital footprint about food on social media, menus, recipes, retail, blogs, discussion forums, etc. and couples them with internal company data to arrive at the product attributes and features that address the unmet needs of end consumers. The patent-pending AI tech of AI Palette consists of an NLP stack and a Computer Vision stack. In Asia, every region has its own nuances and language complexity, which is why we have built native-language, food-trained models for various Asian geographies (over 10, including China, Korea, Thailand, India, and Malaysia) to understand local food preferences. Moreover, people these days love to share more through images than text, which is where we use Computer Vision models to analyze and identify what you are having along with your McDonald’s Burger.

Ali Leylani

AI Architect, Senior Data Scientist at Atea Sverige

The value and importance of explainability, and why striving towards it is critical for every organisation aiming to capitalize on data with machine learning.

Technical track

Bio

Ali Leylani – Lead Data Scientist at Atea and Board Member of Stockholm AI. With a strong background in mathematics and theoretical physics, Ali daily helps businesses adopt an objective, data-driven philosophy.

Abstract

Ali will first give a brief introduction to the field, explaining the difference between predicting and explaining, and then highlight the latest best practices and share lessons learned from real business cases.

Oles` Petriv

Chief Technology Officer at Reface AI

TBA Soon

Technical track

Bio

For the last seven years, Oles has been actively researching and developing computer vision and natural language processing systems. He is the author of a machine learning course on the Prometheus platform and an in-depth training course at the ARVI Lab. He has extensive experience in video processing using deep learning methods for detecting objects and actions, predicting image depth maps, semantic segmentation and generating subtitles for images and video studios in Hollywood. Oles developed one of the first automation systems to control the placement of groceries on store shelves using neural networks. He has led the development of many projects for automated analysis of news in various languages, recognition of entities, analysis of conceptual drift and representation of language structures using machine learning systems.

Marta Paes Moreira

Developer Advocate at Ververica

Building an End-to-End Analytics Pipeline with PyFlink

Technical track

Bio

Marta is a Developer Advocate at Ververica (formerly data Artisans) and a contributor to Apache Flink. After finding her mojo in open source, she is committed to making sense of Data Engineering through the eyes of those using its by-products.

Abstract

Stream processing has fundamentally changed the way we build and think about data pipelines — but the technologies that unlock its value haven’t always been friendly to non-Java/Scala developers. Flink has recently introduced PyFlink, allowing developers to tap into streaming data in real-time with the flexibility of Python and its wide ecosystem for data analytics and Machine Learning. In this talk, we'll explore the basics of PyFlink and showcase how developers can make use of familiar tools like interactive notebooks to unleash the full power of an advanced stream processor like Flink.

Brian Lucena

Consulting Data Scientist at Agentero

Workshop track, StructureBoost - a new Gradient Boosting Package

Workshop track

Bio

Brian Lucena is Principal at Numeristical and the creator of StructureBoost, ML-Insights, and SplineCalib. His mission is to enhance the understanding and application of modern machine learning and statistical techniques.

Abstract

The values of a categorical variable frequently have a structure that is not ordinal or linear in nature. For example, the months of the year have a circular structure, and the US States have a geographical structure. Standard approaches such as one-hot or numerical encoding are unable to effectively exploit the structural information of such variables. In this tutorial, we will introduce the StructureBoost gradient boosting package, wherein the structure of categorical variables can be represented by a graph and exploited to improve predictive performance. Moreover, StructureBoost can make informed predictions on categorical values for which there is little or no data, by leveraging the knowledge of the structure. We will walk through examples of how to configure and train models using StructureBoost and demonstrate other features of the package.
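
The idea of describing a categorical variable's structure as a graph can be sketched in a few lines of plain Python. This is only an illustration of the concept under stated assumptions: it does not use or reproduce the StructureBoost API, and the helper names `cycle_graph` and `graph_distance` are hypothetical.

```python
from collections import deque

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def cycle_graph(values):
    """Adjacency mapping in which each value neighbours the previous
    and the next one, wrapping around at the ends (hypothetical helper)."""
    n = len(values)
    return {
        value: {values[(i - 1) % n], values[(i + 1) % n]}
        for i, value in enumerate(values)
    }


def graph_distance(graph, start, goal):
    """Number of edges on the shortest path, found by breadth-first search."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # goal is not reachable from start


month_graph = cycle_graph(MONTHS)

# A plain numerical encoding places December and January 11 steps apart,
# while the circular graph treats them as direct neighbours.
ordinal_gap = MONTHS.index("Dec") - MONTHS.index("Jan")
graph_gap = graph_distance(month_graph, "Dec", "Jan")
```

A geographical structure such as the US States could be expressed the same way, with an edge between every pair of states that share a border.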

Borys Pratsiuk

Chief Technology Officer at Scalarr

Merge your data science team with your production processes

Business track

Bio

Borys graduated with honors from the Chair of Physical and Biomedical Electronics at KPI in 2007, specializing in “Physical and Biomedical Electronics”. He works as CTO at Scalarr.

Abstract

Your company operates according to well-established rules, but you have decided to go into machine learning and optimize it. How will you plan release dates and sprint duration? Who will be responsible for model stability in production? When did the DevOps team become more important than the DS team? I will answer these and many other questions in my presentation, which covers process optimization and improving cross-team collaboration. It will be my story.

Marta Markiewicz

Head of Data Science at Objectivity

Hack your life with data science

Business track

Bio

Head of Data Science at Objectivity with a background in Mathematical Statistics. For about 9 years, she has been discovering the potential of data in various business domains, from medical data through retail, HR, finance, and aviation to real estate. She deeply believes in the power of data in every area of life.

Abstract

It’s not a mystery that AI has the power to transform the business. But what about everyday reality? In this talk I would like to show, using examples, how to hack your own life with data science — finance, personal development, health, you name it!

Badr Ouali

Head of Data Science at Vertica

VerticaPy: Scalable in-DB Data Science with Python Front-End

Technical track

Bio

Badr Ouali works as a Data Scientist for Vertica worldwide. He can embrace data projects end to end through a clear understanding of the “big picture” as well as attention to details, resulting in achieving great business outcomes — a distinctive differentiator in his role.

Abstract

Nowadays, 'Big Data' is one of the main topics in the data science world, and data scientists are often at the center of any organization. The benefits of becoming more data-driven are undeniable and are often needed to survive in the industry. Vertica was the first real analytic columnar database and is still the fastest in the market. However, SQL alone isn't flexible enough to meet the needs of data scientists. Python has quickly become the most popular tool in this domain, owing much of its flexibility to its high level of abstraction and impressively large and ever-growing set of libraries. Its accessibility has led to the development of popular and performant APIs, like pandas and scikit-learn, and a dedicated community of data scientists. However, Python only works in-memory in a single-node process. While distributed programming languages have tried to face this challenge, they are still generally in-memory and can never hope to process all of your data, and moving data is expensive. On top of all of this, data scientists must also find convenient ways to deploy their data and models. The whole process is time-consuming. VerticaPy aims to solve all of these problems. The idea is simple: instead of moving data to your tools, VerticaPy brings your tools to the data.

Romain Paulus

Lead Research Scientist

Semi-supervised and unsupervised abstractive summarization

Technical track

Bio

Romain Paulus is a former Lead Research Scientist at Salesforce Research, focusing his work on deep learning for abstractive text summarization and natural language generation. Before that, he was the founding engineer of the California-based startup MetaMind, where he led the full-stack development of a deep-learning-as-a-service platform.

Abstract

Abstractive summarization has been getting a lot of people's attention in the NLP community as a unique unsolved problem. It's a multi-faceted task where a model not only has to understand the main topic of a document, but also write a clear and factually correct summary of it. Moreover, there is limited supervised data available for training summarization models outside of a few specific domains like news. In this talk, we will explore the different ways to train abstractive summarization models with little or no supervised data, and we will discuss how it changes the way we tend to approach complex NLP problems.

Svetlana Vinogradova

Lead Data Scientist at InsideTracker (Segterra)

Blood Biomarkers Data and Data from Wearables: Insights for Personalized Recommendations

Technical track

Bio

Svetlana Vinogradova is a Lead Data Scientist at InsideTracker, leading the Data Science team to integrate blood biomarkers and DNA data with physiological data from activity trackers to improve lifestyle recommendations and discover new patterns and optimal zones in sleep, heart rate, and blood biomarkers.

Admond Lee

Consulting Data Scientist, Admond Lee; Contributing Writer, Towards Data Science; KDnuggets

Panel discussion

Bio

With his degree in Physics, Admond discovered and pursued his passion for data science and has never looked back. A data science communicator at heart, Admond has inspired others with his journey into data science. His story and data science work have been featured by various publications, including KDnuggets, Medium, Tech in Asia, AI Time Journal, and business magazines. Admond has also been invited to speak at various workshops and meetups. He is now on a mission to make data science accessible to everyone by helping companies truly leverage the power of data science to drive business value and by guiding students as well as professionals into the data science field.

Roman Mogylnyi

CEO/Co Founder at Reface AI

TBA Soon

Business track

Marcel Worring

Professor of Computer Science at University of Amsterdam

Interactive Exploration of Multimedia Data

Technical track

Bio

Prof. dr. Marcel Worring is a professor of computer science at the University of Amsterdam. He has a long research history in multimedia. He is co-author of the renowned “CBIR at the end of the early years” paper and, 20 years later, is still intrigued by the challenge of truly interactive methods for multimedia retrieval and the new opportunities that deep learning brings. He has written over 200 papers in the field with a focus on multimedia analytics, combining multimedia analysis, interaction, and visualization to give people insight into large multimedia collections. He is the director of the Innovation Center for Artificial Intelligence – Amsterdam, a center where universities work together with industry and governmental organizations in joint research labs with a span of five years and at least five Ph.D. students. He is co-directing two such labs, one on techniques to support law enforcement and one on medical imaging, and in addition has research projects on art and city analytics. He has been associate editor of ACM Transactions on Multimedia and IEEE Transactions on Multimedia, and is currently associate editor of IEEE Multimedia. He was co-chair of ACM Multimedia 2016, is the program coordinator of ACM Multimedia 2020, and is program chair of ACM ICMR 2021.

Abstract

Multimedia collections in various domains contain a wealth of information. When collections are large, exploring the collection to find the relevant information is far from trivial. Tools are needed to support users in their quest. The dominant mode of access is still search, but interactive exploration requires various activities ranging from search to browsing. Categorization, in which each item receives a membership score, provides a unifying framework for many of these tasks and, with efficient high-dimensional indexing, can now be performed interactively even for very large collections. Next to categories, relations among the items are important, and hypergraphs form an elegant way to model them. These are the ingredients for true multimedia analytics systems in which multimedia analysis, visualization, and machine learning come together to support interactive exploration in an optimal way. In this talk, we highlight the progress made in multimedia analytics, show some of the solutions we developed, and reflect on the way to move forward.

Jörg Schad

Head Of Engineering and Machine Learning at ArangoDB

Workshops track, Building OpenSource Machine Learning Pipelines

Workshops track

Bio

Jörg Schad is Head of Machine Learning at ArangoDB. In a previous life, he worked on machine learning pipelines in healthcare and finance, distributed systems at Mesosphere, and in-memory databases. He received his Ph.D. for research around distributed databases and data analytics. He’s a frequent speaker at meetups, international conferences, and lecture halls.

Abstract

There are many great tutorials for training your deep learning models using TensorFlow, Keras, Spark, or one of the many other frameworks. But training is only a small part of the overall deep learning pipeline. This workshop gives an overview of building a complete automated deep learning pipeline, starting with exploratory analysis and continuing through training, model storage, model serving, and monitoring, and answers questions such as:

  • How can we enable data scientists to exploratively develop models?
  • How can we automate distributed training, model optimization, and serving using CI/CD?
  • How can we easily deploy these distributed deep learning frameworks on any public or private infrastructure?
  • How can we manage multiple different deep learning frameworks on a single cluster, especially considering heterogeneous resources such as GPU?
  • How can we store and serve models at scale?
  • What metadata should be stored in a production setup?
  • How can we monitor the entire pipeline and track the performance of the deployed models?

The participants will build an end-to-end data analytics pipeline including:

  • Pipeline Orchestration with TFX, Kubeflow, and Airflow
  • Data preparation
  • Jupyter Notebooks
  • Distributed training with TensorFlow
  • Automation & CI/CD using Jenkins and Argo
  • Model and metadata storage
  • Model serving and monitoring

Tetiana Kodliuk

Chief Science Officer at Dathena Science

Getting Ready for Fast-Changing World: Drifts detection in Data Security

Technical track

Bio

Tania leads the Data Science team at Dathena and is responsible for innovation and patenting. With a mathematical background, she is passionate about Natural Language Processing and Deep Learning and truly believes in the need for Responsible AI. That belief drives the projects on AI Explainability and Auditability for Data Security and Privacy. She is also building a self-driven platform, which she calls “Dathena's brain”, to manage continuous data analysis through an autonomous decision-making system.

Abstract

If you know that deploying a high-quality model in production is not the end of a fairytale like "And they lived happily ever after", you might find this talk interesting. Data Security requires continuous data analysis and risk assessment, which can be successfully achieved by Machine Learning solutions. There is one "But" though: when the models are deployed in production with every-minute data re-scanning, they can lose prediction accuracy over time as real-life data is rarely stationary. Don't believe it? Check the emails recommending a Balinese paradise that you receive while you are stuck in your country due to COVID-19. Here we face the "Data Drift" challenge, which is defined as a change in the distribution of data used in a predictive task. Detecting changes in new incoming data is key to making sure the predictions obtained are valid. We will continue this talk with solutions, such as Active Learning, that can help to maintain high-quality data analysis.
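
A change in the distribution of incoming data can be checked in many ways. The sketch below uses one common option, the two-sample Kolmogorov-Smirnov statistic, computed with the standard library only. It is an illustrative assumption rather than the method presented in the talk; the function names and the `threshold` value are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(reference, incoming):
    """Largest gap between the empirical CDFs of two numeric samples."""
    ref = sorted(reference)
    new = sorted(incoming)
    gap = 0.0
    for x in sorted(set(ref) | set(new)):
        cdf_ref = bisect_right(ref, x) / len(ref)  # share of ref values <= x
        cdf_new = bisect_right(new, x) / len(new)  # share of new values <= x
        gap = max(gap, abs(cdf_ref - cdf_new))
    return gap


def has_drifted(reference, incoming, threshold=0.3):
    """Flag drift when the CDF gap exceeds a threshold.
    The default of 0.3 is an arbitrary illustrative value, not a recommendation."""
    return ks_statistic(reference, incoming) > threshold


training_feature = [1, 2, 3, 4]   # values seen when the model was trained
same_period = [1, 2, 3, 4]        # new data from an unchanged distribution
shifted_period = [3, 4, 5, 6]     # new data after a shift

no_drift = has_drifted(training_feature, same_period)
drift = has_drifted(training_feature, shifted_period)
```

In practice the threshold would be tuned per feature, or replaced by a proper significance test, since a fixed cut-off ignores sample size.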

Jon McLoone

Director of Technical Services, Communication and Strategy at Wolfram Research Europe Ltd.

Unified data + unified computation = Multi-paradigm data science

Technical track

Bio

As Director of Technical Services, Communication and Strategy at Wolfram Research Europe, Jon McLoone is central to driving the company's technical business strategy and leading the consulting solutions team. Described as “The Computation Company”, the Wolfram group are world leaders in integrated technology for computation, data science, and AI including machine learning. With over 25 years of experience working with Wolfram Technologies, Jon has helped in directing software development, system design, technical marketing, corporate policy, business strategies, and much more. Jon gives regular keynote appearances and media interviews on topics such as the Future of AI, Enterprise Computation Strategies, and Education Reform, across multiple fields including healthcare, fintech and data science. He holds a degree in mathematics from the University of Durham. Jon is also Co-founder and Director of Development for computerbasedmath.org, an organization dedicated to fundamental reform of maths education and the introduction of computational thinking. The movement is now a worldwide force in re-engineering the STEM curriculum with early projects in Estonia, Sweden and Africa.

Abstract

While greater automation has made machine learning and data science tools accessible to non-experts, that same automation is equally important to the expert user. By breaking down the barriers between different kinds of computation a truly multi-paradigm approach to data science becomes possible. This talk will demonstrate Wolfram Research's progress towards a fully unified computation platform including live coded machine-learning, computer vision and production deployment. Making this all possible is an underlying symbolic representation that unifies data, models, code, and interfaces. The talk will explain how this simplifies high-level concepts and enables their automation. Examples will include surgery and transfer learning on a neural network and automated anomaly detection. Jon McLoone is a senior developer with nearly 30 years of experience at Wolfram Research where he leads the technical services team, developing data science solutions for customers from industries ranging from energy to finance, and education to medicine.

Paweł Zawistowski

Lead Data Scientist at Adform

How good is your model?

Technical track

Bio

Senior Data Scientist working in Adform’s Research, AI & Analytics area and an assistant professor at the Faculty of Computer Science at the Warsaw University of Technology. IT wizard specializing in artificial intelligence methods, especially in the nonlinear issues of regression and classification, which were the subject of his doctorate. He gained professional experience both in science and research and in commercial projects. He has been seriously analyzing and modeling data since 2008. Since then he has participated in various projects, ranging from individual analyses of small data sets, through the development of regression and classification methods in research projects, to the creation of large-scale production systems that use predictive models hundreds of thousands of times per second.

Abstract

In applied data science, you build your model for some specific purpose. Before you are ready to ship it to production, or hand it to your customer, the question arises: is it "good enough"? This question is tricky because what exactly it means will vary from project to project. Even for a given case, if you ask different stakeholders, you might get different answers ranging from expected ROI, through good AUC values, to technical aspects like latencies and memory requirements. Yet, it is crucial to get the answer right if you want your model to thrive. This talk will try to address model evaluation widely, touching on subjects like defining acceptance criteria for your model, the importance of baselines, performing evaluation using A/B tests, and other techniques, along with discussing some pitfalls you might encounter. We will talk about the subject from a practical perspective: scenarios in which it might not be obvious how to evaluate your model, and in which simple comparisons of standard measures like accuracy or MSE do not seem to be enough.
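
The importance of baselines can be made concrete with a small sketch. The labels and predictions below are made-up illustrative data, and the helper names are hypothetical; the point is only that a headline accuracy figure means little until it is compared with a trivial baseline.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train, n_predictions):
    """Always predict the most frequent class seen in training."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n_predictions


# Made-up, imbalanced example data (assumption for illustration only).
y_train = [0] * 9 + [1]
y_test = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
model_predictions = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]  # hypothetical model output

model_accuracy = accuracy(y_test, model_predictions)
baseline_accuracy = accuracy(y_test, majority_baseline(y_train, len(y_test)))

# The gain over the baseline is what the model actually contributes.
lift_over_baseline = model_accuracy - baseline_accuracy
```

Here the model reaches 90% accuracy, but always predicting the majority class already reaches 80%, so an acceptance criterion stated as raw accuracy would overstate the model's value.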

Meeta Dash

VP of Product at Appen

Every company can be an AI company

Business track

Bio

Meeta is a passionate, customer-obsessed product leader with a track record of launching innovative products that solve real business problems. As VP Product at Appen, she is building a machine learning data platform focused on Computer Vision, Autonomous Vehicles, Conversational AI, and NLP. Prior to Appen, Meeta held several product leadership roles in Cisco Systems, Tokbox/Telefonica, and Computer Associates with a focus on AI, Chatbots, Voice/Video, and Data Analytics.

Abstract

Artificial Intelligence is set for explosive growth and is impacting the future of every industry and human interaction. With so much hype all around us, here’s the million-dollar question: “Is AI living up to its promise in the enterprise?” The reality is that most companies are still struggling to move from experimentation to production and to justify the business value of AI products. Unsurprisingly, business leaders and technologists have very different views about the current challenges. How do we bring business and technology together and successfully scale AI projects? I will share with you real-world processes, management techniques, and tools needed for running AI at scale, including but not limited to:

  • Taking a business-first approach to AI
  • Organizational structure and culture
  • Successfully moving from prototype to production
  • Techniques and tooling to effectively train, deploy, monitor & tune machine learning models
  • Building the AIOps flywheel to make AI a core part of your business

Olivier Blais

Cofounder, VP Data Science at Moov AI

Validate and Monitor Your AI and Machine Learning Models

Technical track

Bio

Olivier is a data science expert whose leading field of expertise and cutting-edge knowledge of AI and machine learning led him to support many companies’ digital transformations, as well as implementing projects in different industries. Olivier is the laureate of the prestigious “30 under 30” prize. He is co-author of a patent for an advanced algorithm that evaluates the creditworthiness of a borrower.

Abstract

You’ve created a wicked AI or machine learning model that changes the way you do business. Good job. But how do you validate your model and monitor it in the long run? Advanced machine learning and AI models get more and more powerful. They also tend to become more complicated to validate and monitor. This has a major impact on the business’ adoption of models. Initial validation and monitoring are not only critical to ensure the model’s sound performance, but they are also mandatory in some industries like banking and insurance. You will learn the best techniques that can be applied manually or automatically to validate and monitor statistical models. The following techniques will be discussed and demonstrated to perform a full model validation:

  • Techniques used for initial validation.

2-3 topics for post-discussion? Model validation, model monitoring, machine learning use cases in general.

What are some infrastructure and languages discussed? This talk is infrastructure agnostic. Python (mostly TensorFlow or PyTorch).

What you'll learn? You'll learn a cutting-edge framework which you can't find on Google yet. We'll show DevOps techniques using open source packages.

Eugene Khvedchenya

AI/ML Advisor at VITech

Deep learning for satellite image processing

Technical track

Bio

Eugene is an AI/ML consultant with a strong focus on computer vision. He has over 10 years of experience in the software development industry, with strong technical skills and experience in creating high-load applications. During his career he has worked in a wide spectrum of domains, from cloud computing to edge devices and from FPGA and C to Python. He is the author of the pytorch-toolbelt (https://github.com/BloodAxe/pytorch-toolbelt) library, a member of the core team of the Albumentations library (http://albumentations.ai/), and a contributor to the Catalyst (https://catalyst-team.github.io/catalyst/) DL framework. Kaggle Master, ranked in the Top-100 of the Kaggle rating worldwide. Author of "Mastering OpenCV for practical computer-vision projects".

Abstract

Satellites generate tremendous amounts of data every day, and this data helps to spot wildfires, coastline erosion, building damage in natural disasters, and the concentration of nutrients in farm crops from the sky. In this talk I will explain how deep learning can solve these problems. We will go through recent data science competitions on satellite imagery and analyze the know-how of the top-performing solutions. For a better experience, attendees should have some prior experience with deep learning and image segmentation.

Yann Landrin-Schweitzer

Founder and CEO at Stealth Startup

Mathematically defensible data privacy as a Data Science accelerator.

Business track

Bio

After an early start in mechanical engineering, Yann has been working on Data Science, AI and Data Engineering at scale since 2003. He has applied these skills in various industrial contexts, in several social media startups, in advertising giants like Yahoo, and digital content creators and distributors like Netflix.

Abstract

2020 is the time for data privacy. Today, it is complicated for customers to ensure their privacy is maintained, and complicated for companies to use data safely and deliver on privacy expectations. As a result, opportunities to use data for good are missed, at the same time as data misuse is rampant, and data teams struggle to get value out of their data. Data Science teams end up regarding privacy as difficult, obscure and frustrating, an obstacle between them and achieving the goals they are given in their organization. And business leaders end up seeing privacy as purely an exercise in risk mitigation, rather than something that can be a competitive advantage.

Olga Petrova

Machine Learning DevOps Engineer at Scaleway

Active learning: how to reduce the amount of data that needs to be labeled

Technical track

Bio

Olga is a deep learning R&D engineer at Scaleway, the second-largest French cloud provider. Previously, she received her Ph.D. in theoretical physics from Johns Hopkins University and spent several years working as a quantum physicist. Olga’s current interests focus on semi-supervised and active machine learning.

Abstract

Most of the recent advances in the deep learning field come at a high price. The costs involved in developing and training these models are two-fold: they can be attributed to computing power and to training data. Computational resources are becoming increasingly affordable through widespread cloud computing services. On the other hand, gathering and especially manually labeling data cannot scale in the same way. A common scenario is one in which unlabeled data comes cheap, but the labeling budget is severely limited. Practice shows that not all data is created equal: the choice of which data is prioritized for labeling has a profound effect on the final performance of the resulting model. The task of determining which data samples would be most "informative" when labeled falls under what is known as active learning. In this talk, I will present an overview of the active learning approach as applied to an image classification problem.
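
One widely used way to decide which samples are most "informative" is uncertainty sampling: label first the samples on which the current model is least sure. The sketch below ranks unlabeled samples by the entropy of their predicted class probabilities. It is a minimal illustration under stated assumptions, not necessarily the strategy covered in the talk; the sample ids, probabilities, and function names are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy of a class-probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_for_labeling(predictions, budget):
    """Return the `budget` sample ids whose predictions are most uncertain.

    `predictions` maps a sample id to the model's class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda sample_id: entropy(predictions[sample_id]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled images, three classes each.
unlabeled = {
    "img_a": [0.98, 0.01, 0.01],  # confident prediction
    "img_b": [0.40, 0.35, 0.25],  # uncertain
    "img_c": [0.70, 0.20, 0.10],  # moderately confident
    "img_d": [0.34, 0.33, 0.33],  # close to uniform, most uncertain
}

to_label = select_for_labeling(unlabeled, budget=2)
```

With a labeling budget of two, the near-uniform predictions are sent to annotators first, and the confidently classified image is left unlabeled.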

Colin Gillespie

Data Scientist at Jumping Rivers

Enforcing Standards in a Data Science Workflow

Technical track

Bio

Dr. Colin Gillespie is a Senior Lecturer (Associate Professor) at Newcastle University, UK, and the co-founder of Jumping Rivers. His research interests are high-performance statistical computing and Bayesian statistics. He has given talks at a variety of conferences, including useR, RStudio, the Turing Institute, and ODSC.

Abstract

Many R workflows revolve around packages and git. Typically, they use some form of continuous integration, such as Travis or GitLab CI. The general idea is that R developers are notified if a commit causes the package to fail some checks. This talk will describe the additional rigorous steps that we apply to our checks via the integrated package. Using this package allows us to standardize code style, catch errors more quickly, and produce more readable commits. We will highlight that while imposing these tests can initially slow down progress on a project, overall they lead to a more robust product.

Vladyslava Tyshchenko

Data Analyst at SoftServe

Detecting Biomarkers of Aging using Machine Learning Algorithms

Technical track

Bio

Vladyslava obtained her BS and MS degrees in Software Engineering with honors from Dnipro National University. At SoftServe she works on NLP projects for various industries. She is passionate about computational biology and is involved in research projects on the biology of aging, where she applies machine learning algorithms. She has experience in working with biomedical texts and with metagenomic and transcriptomic data.

Abstract

Recent advances in the accuracy and diversity of machine learning and deep learning algorithms push researchers around the world to apply them to a variety of fields. One such field is biogerontology, where scientists are trying to answer aging-related questions like "why do we age" or "can we age slower" from a biological point of view using recent computational methods. In this talk, we will find and analyze potential biomarkers of aging in a gene expression dataset. We will go through important tips that one should know while analyzing genomic data, like data transformations, choosing the right model, stability of feature selection, model explanation and interpretation of the results.

Ravi Ilango

Senior Data Scientist at a startup in stealth mode

Using PreTrained NLP Models and Machine Intelligence/AI for automation

Technical track

Bio

Currently working as a founding team member and Senior Data Scientist at a startup in stealth mode. Passionate about developing deployable deep learning solutions.

Abstract

Natural Language Processing (NLP) is one of the fastest-growing segments of deep learning/AI, revolutionizing operational efficiency in service businesses. NLP focuses on understanding languages and uses a variety of techniques and tools, ranging from Data Engineering (NLP pipelines) and Data Science to GPUs and pre-trained deep learning models. Top AI companies are using NLP to implement solutions that provide a quantum leap in operational efficiency in service industries. This session will focus on using pre-trained NLP models (BERT, GPT-2) and will include a demo of an NLP pipeline, the PyTorch framework, and spaCy.

Michael Grogan

Data Scientist, TensorFlow and Time Series Specialist (self-employed)

Predicting Hotel Cancellations with Machine Learning

Business track

Bio

Michael Grogan is a data scientist with expertise in TensorFlow and time series analysis. His educational background is a Master’s degree in Economics from University College Cork, Ireland. Much of his work has been in the domain of business intelligence; i.e. using machine learning technologies to develop solutions to a wide range of business problems.

Abstract

Hotel cancellations can cause issues for many businesses in the industry. Not only do cancellations result in lost revenue, but they can also cause difficulty in coordinating bookings and adjusting revenue management practices. This session will provide a high-level analysis of different feature selection and classification tools, methods for dealing with imbalanced datasets, and interpretable machine learning models. Time series modeling techniques will also be discussed, including models such as ARIMA and LSTM, along with structural time series modeling using the TensorFlow Probability library.

Diego Hueltes

Machine Learning Manager at RavenPack

From the Earth to the Moon: Lessons from the space race to apply in Machine Learning projects

Business track

Bio

I am the Machine Learning Manager at RavenPack in Marbella (Málaga, Spain). I teach in the Big Data & Analytics master’s program at ESESA IMF, a degree awarded by Antonio de Nebrija University. I have also taught in the Big Data Executive Program at Escuela de Organización Industrial (EOI), a Spanish business school where I have also been a Big Data mentor.

Abstract

The space race was a competition between the United States and the Soviet Union to conquer space. This competition helped to develop space technology at an incredible pace, producing other derivative technologies as a side effect. The race was full of successes on both sides, achieving goals that seemed impossible in record time. From the space race, we can learn some lessons that we can apply to our Machine Learning projects to achieve a higher success rate in a limited amount of time.

Ylan Kazi

VP, Data Science + Machine Learning at UnitedHealth Group

How AI Will Decide Your Fate

Business track

Bio

Ylan Kazi is the Vice President, Data Science and Machine Learning for UnitedHealthcare, based in Minnetonka, Minnesota. He leads a team of high-performing data scientists focusing on improving health outcomes for Medicare patients. Ylan is skilled at leading data science teams that apply machine learning to solve business challenges and deliver business value. He is active in the data science community and serves as an AI/ML advisor to Smart Steward, a company that provides solutions to combat antibiotic resistance and COVID-19. Ylan also writes about how AI will affect humanity at discoveringai.com.

Abstract

My presentation will cover how AI is starting to shape how people behave and how it is being integrated into our lives. Everything from getting a new job, to the legal system, to social media is affected by AI, and we are starting to give AI more control over these decisions. I will also show how this can affect us (both positively and negatively) and what people can do about it.

Mark Kurtz

Machine Learning Lead at Neural Magic

Pruning Neural Networks for Success

Technical track

Bio

Experienced Software and Machine Learning Leader with a demonstrated history in the internet industry. Proficient across the full stack for engineering and machine learning. Strong engineering professional with a Master’s Degree focused in Robotics Engineering from Washington University in St. Louis.

Abstract

According to a recent survey, 59% of data scientists are not optimizing their deep learning models for production, despite the performance gains that techniques like quantization and pruning can offer. This is no surprise. Model optimizations are hard. But we are here to tell you that we found a way to make model optimizations easy. Join our webinar on May 28 to:
  • Get an overview of pruning, including benefits and downsides.
  • Discover new tools that make pruning easy and successful.
  • Learn how to prune models for performance in production.
  • Understand pruning techniques that result in lower deployment costs.

Oleg Boguslavskyi

Co-Owner Data Science UA

Jane Klepa

Executive Director 1991 Open Data Incubator

Daniel Che

Founder at CHE guerrilla agency

Dennis Lyubyvy

Lead Data Scientist at Foundry.ai

Get tickets

Smart
16.09 − 31.10
Last chance
from 1.11

A few days before the event, we will send you access to the streaming system.

All reports will be available to participants after the end of the conference. We will send them within a few days after the event.

5% of the money for the purchased tickets will be donated for the Charity campaign of Group of Active Rehabilitation.

Discounts

25% for students and teachers. To get a discount promo code, please send a photo of your student card or supporting document to info@data-science.com.ua.

5% for 2+ tickets
7% for 3+ tickets
10% for 5+ tickets

Partners

Track partner

Smart partners

Exclusive Office Partner

Infopartners

ITVDN
CBS
Data Art

FAQ

What is Data Science UA?

Data Science UA is a Ukrainian company that was established in 2016 in Kyiv, Ukraine.

Over the years we’ve built an ecosystem around a community of 5000+ professionals in data science and AI, which allows us to provide:
– high-quality recruitment (150+ closed senior-level positions);
– consulting for companies in Ukraine and around the world;
– mentorship programs;
– support in opening AI R&D centers.

We’ve organized 8 international Data Science UA conferences with 6000+ attendees and are now launching a new format: the international online 9th Data Science UA Conference.

Is there any ticket discount?

We provide a 25% ticket discount for students and teachers. To get a discount promo code, send a photo of a student card or a document confirming that you work as a professor to info@data-science.com.ua. We will send you the promo code, which must be applied when buying the ticket. Group discounts: 5% for 2+ tickets, 7% for 3+ tickets, 10% for 5+ tickets. Choose the desired number of tickets on the ticket sales page. The discount will apply automatically. Please note that discounts do not apply to the ‘First 100’ price.

How to get an invoice for the ticket purchase?

To purchase tickets with cashless payment, send an e-mail to info@data-science.com.ua with the necessary information:

– The legal name of the company

– Personal info to create a ticket (first name, last name, phone, position, email)

– Company billing details and number of tickets

Discounts

5% for 2+ tickets, 7% for 3+ tickets, 10% for 5+ tickets.

25% ticket discount for students and teachers.

I am looking for a job, can you help me?

Send your CV to cv@data-science.com.ua; we might know of some projects you will like.

We are always ready to answer your questions, advise or direct you. We love working with people so that the interaction is as effective and comfortable as possible.

Could I make the first step to establishing an AI R&D office in Kyiv, Ukraine?

We set up AI R&D centers in Ukraine and provide full support for their launch and operations. Powered by our large DS&AI community, we can hire the engineering team and get it going in a matter of weeks.
We know almost all of the top engineers in Ukraine in person, so we are always the first to know about new job opportunities or job seekers. This allows us to help engineers find interesting projects, while companies can use our recruiting services to find talented professionals within 2 weeks of starting the search.
Many companies such as Ring, Grammarly, Samsung, DataRobot, and Snap already have their R&D centers here, as Ukraine is the 1st software development destination in Central and Eastern Europe and the 4th largest exporter of IT products and services in the world.

I am looking for a data scientist, can you help me?

Please send your inquiry to info@data-science.com.ua and we will get back to you.

We help companies and individuals all over the world to design and implement solutions for data-driven decision making.