Be A Hero: Transforming GoPro Analytics Data Pipeline
At GoPro, we not only make great cameras but also produce a great software ecosystem for capturing, editing, sharing, and managing content. We collect a lot of data from cameras, drones, web, desktop, social media, and external tools. The rapid increase in data also creates many challenges for the GoPro data pipeline: scalability, operations, resource contention, machine learning support, visualization delivery, and cost management. In this talk, we will discuss many of these challenges, the direction we are moving in, and the initial steps we have taken to address some of them using Spark, Kafka, and S3, leveraging an elastic cloud platform and dynamic schemas.
Making Data Work for Marketers
With the advent of digital, the ways people consume media have proliferated. This explosive growth of media in the past few decades has made a marketer’s job quite challenging. Marketers are always looking for an effective and efficient mix of media to reach their potential customers. One of the most critical decisions a CMO makes is how to allocate her marketing budget among different media to drive higher awareness and sales of her company’s products and services. We have been using a number of datasets, tools, and data science methods to help these marketers make better media decisions. This talk will describe the approaches we have seen work in improving marketing decisions. Making a tangible impact on many business challenges often does not have much to do with data.
Real Time Anomaly Detection & Analytics
Uri Maoz is the Head of Product and Business Development for Anodot, based in San Francisco, California. Uri has over 15 years of experience in the Software Industry, in which he led Product Management, R&D and Business Development. He was in charge of the development and implementation of several Analytics and Machine Learning products. Uri is passionate about driving innovation from seed to product launch while creating an exceptional user experience and products that customers love.
Graph Walks & Vector Embeddings: Exploiting The Head And Exploring The Tail
Pinterest has the world’s largest catalog of human-curated ideas. We’re building a visual discovery engine with 100+ billion ideas, collected by 175+ million people worldwide. As we work to match the right Pin to the right person at the right time, personalization is crucial. Random graph walks with restart are an excellent way to surface popular, high-quality, relevant content. But we can also show you great ideas you may not even have known you were looking for, and that’s where vector embedding comes in. We embed you and these billions of ideas in a 128- or 256-dimensional space. Then we project them down into 1000 bits, cut them up into 16-bit chunks, index these chunks, and then find these ideas for you really fast using core search technology.
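The projection and chunking steps described in this abstract can be sketched in a few lines. This is a minimal illustration and not Pinterest's implementation: it assumes random-hyperplane sign hashing for the projection to bits (the abstract does not say how the projection is done), it uses 64 bits instead of roughly 1000 to stay small, and all names (make_hyperplanes, to_bits, to_chunks, ChunkIndex) are hypothetical.

```python
import random
from collections import defaultdict

DIM = 128        # embedding size (the abstract mentions 128 or 256)
NUM_BITS = 64    # illustrative only; the abstract mentions about 1000 bits
CHUNK_BITS = 16  # chunk width taken from the abstract


def make_hyperplanes(num_bits, dim, seed=0):
    """Assumed projection: one random Gaussian hyperplane per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def to_bits(vector, hyperplanes):
    """Project a real-valued vector to bits: the sign of each dot product."""
    return [
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    ]


def to_chunks(bits, chunk_bits=CHUNK_BITS):
    """Cut the bit string into (chunk position, integer value) keys."""
    chunks = []
    for start in range(0, len(bits), chunk_bits):
        value = 0
        for bit in bits[start:start + chunk_bits]:
            value = (value << 1) | bit
        chunks.append((start // chunk_bits, value))
    return chunks


class ChunkIndex:
    """Inverted index from chunk keys to item ids, like terms in a search index."""

    def __init__(self, hyperplanes):
        self.hyperplanes = hyperplanes
        self.postings = defaultdict(set)

    def add(self, item_id, vector):
        for key in to_chunks(to_bits(vector, self.hyperplanes)):
            self.postings[key].add(item_id)

    def query(self, vector):
        """Return candidate ids, ranked by how many chunks they share with the query."""
        counts = defaultdict(int)
        for key in to_chunks(to_bits(vector, self.hyperplanes)):
            for item_id in self.postings.get(key, ()):
                counts[item_id] += 1
        return sorted(counts, key=lambda i: (-counts[i], i))


# Small demo with made-up random embeddings.
rng = random.Random(1)
planes = make_hyperplanes(NUM_BITS, DIM)
index = ChunkIndex(planes)
pins = {f"pin{i}": [rng.gauss(0.0, 1.0) for _ in range(DIM)] for i in range(100)}
for pin_id, vec in pins.items():
    index.add(pin_id, vec)
candidates = index.query(pins["pin7"])
```

Treating each (position, value) chunk as a search term is what lets an ordinary inverted index serve as the lookup structure: items that share many chunks with the query are likely to be close in the embedding space, and the shortlist can then be re-ranked with exact distances.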
Data Science Goes to Hollywood
Netflix will spend six billion dollars this year on content, making the company a major player in Hollywood. An increasing portion of this spend will be on original shows such as House of Cards, and original movies such as Beasts of No Nation. As we continue to expand our involvement with Hollywood, we want to leverage data and data science to make the best decisions possible. This talk will explore areas where we see the most opportunity to apply data science to Hollywood, and some early approaches we've taken.
The Neural Networks Behind Speech and Language
This presentation will discuss how deep learning is leading innovation in state-of-the-art speech recognition for multiple languages at Facebook.
The Funny Side of Data Science
Everyone knows that eHarmony studies marriages to find out what features of individuals and couples predict the best relationships. What you may not know, however, is that eHarmony has also spent years using complex machine-learning algorithms to predict which couples are most likely to talk to each other online so that we can help our users actually get to that first date. Steve will discuss eHarmony's research into what makes for a great first date, and our first study, which focuses on the funniest part of your personality: your sense of humor.
NLP for Conversational Assistants
In this presentation, Zornitsa Kozareva will discuss how Natural Language Processing is used in Amazon's conversational assistant, Alexa.
Productionize Machine Learning: Learning To Rank At Turo
With hundreds of thousands of cars available on Turo, how can relevance be defined? What are the best cars to show? This talk will cover the full lifecycle of a Machine Learning model from data collection to its deployment in production.
Data Science in Inventory Management @WalmartLabs
Wal-Mart has a couple of million items available through its e-commerce operations, with over a million of these eligible for free 2-day shipping. The sheer scale of the problem makes optimizing the supply chain to keep costs low, while ensuring delivery SLAs are met, an extremely daunting challenge. This leads to numerous data science problems, ranging from demand forecasting (including modeling promotions) to inventory optimization. We describe a few of these problems, along with the innovative machine learning techniques used to tackle them.
Open Source ML Systems That Need To Be Built
Machine learning has taken the world by storm in the last few years, and yet it is widely believed that this is just the beginning. It is a very exciting time for people working on machine learning systems; after all, we are all still collectively discovering the best abstractions, systems, and tools to tackle ML problems. We, at Quora, have been running complex ML systems of various types in production at scale for a few years now. I also lead the team that is building out Quora's ML Platform. In this talk, I'll draw from this experience and discuss some ML systems which, in my view, could be broadly useful to a lot of people and, to the best of my knowledge, don't yet exist in the open source world. The ulterior goal of my talk is just to mobilize the community and get the discussion going, with the hope that it might help all of us together push the state of the art.
Deep Learning and its Application in Trading and Financial Risk Management
In this session, we explore the rise and popularity of deep learning and its relationship to advanced analytics. First, we start with an introduction, lineage and the intuition behind deep nets. Second, we explore some of the popular deep learning use cases before examining an actual working application of deep nets in trading and finance. Finally, we examine the pros and cons of various deep learning frameworks and how to go about choosing one that makes sense for your project.
Panel - How Real Businesses Can Use Machine Learning
According to global consulting firm Accenture, intelligent automation powered by machine learning processes was 2016’s biggest tech trend. Such technology has huge implications for organizations across all industries, and the wealth of structured and unstructured data that companies now hold, along with the rise in affordable products, means that machine learning and AI automation are a future not to be ignored. This panel session will discuss the new capabilities made possible through effective machine learning processes, and explore solutions related to privacy, governance and growing a team to drive them forward. Panelists include: Paolo Massimi, Head of Rider Data Science, Uber; Saket Kumar, Chief Data Scientist, Google; Waibhav Tembe, Data Science Leader, Unisys; Anmol Rajpurohit, Software Engineer, Salesforce.
Pre-Summit Registration & Light Breakfast
Machine Learning Infrastructure in Production: A Case Study at Lyft
This talk will survey a few applications of machine learning in production at Lyft, focusing on the interface between data science code and application code. It will highlight a few common challenges, approaches we've taken to solving them, and learnings along the way.
Data Science to Combat Fraud
Hear how Square uses data science to identify anomalies in the vast number of transactions that are processed every minute in order to effectively combat fraud and optimize performance.
GB-CENT: Gradient Boosted Categorical Embedding & Numerical Trees
Latent factor models and decision tree based models are widely used in tasks of prediction, ranking and recommendation. Latent factor models have the advantage of interpreting categorical features by a low-dimensional representation, while such an interpretation does not naturally fit numerical features. In contrast, decision tree based models enjoy the advantage of capturing the nonlinear interactions of numerical features, while their capability of handling categorical features is limited by the cardinality of those features. Since in real-world applications we usually have both abundant numerical features and categorical features with large cardinality (e.g. geolocations, IDs, tags etc.), we design a new model, called GB-CENT, which leverages latent factor embedding and tree components to achieve the merits of both while avoiding their demerits. With two real-world data sets, we demonstrate that GB-CENT can effectively (i.e. fast and accurate) achieve better accuracy than state-of-the-art matrix factorization, decision tree based models and their ensemble.
TensorFlow is the most popular deep learning library currently. This talk will give you an overview of TensorFlow's computation model, setting up graphs, and running them. The talk will also show building a deep learning network in less than 20 lines of code.
Bringing Gaming, AR and VR to Life with Deep Learning
Game development is a complex and labor-intensive effort. Game environments, storylines, and character behaviors are carefully crafted requiring graphics artists, storytellers, and software to work in unison. Often games end up with a delicate mix of hard-wired behavior in the form of traditional code and somewhat more responsive behavior in the form of large collections of rules. Over the last few years, data-intensive machine learning solutions have obliterated rule-based systems in the enterprise -- think Amazon, Netflix, and Uber. At Unity, we are exploring the use of deep learning in content creation and deep reinforcement learning in character development.
Machine Learning in Airbnb
Airbnb is a two-sided marketplace connecting guests to hosts. Machine learning is used on traditional problems like Ranking and Recommendation systems, but also in domains like Risk, Trust and Safety and Growth marketing.
The Science Behind Amazon Alexa
In this presentation, Francois Mairesse will discuss how machine learning informs decision-making in everything Amazon does, in particular with regard to how Alexa uses speech-to-text to provide a revolutionary customer experience.
Applying Deep Learning to Medicine: Genomics & Medical Imaging Applications
In this presentation, Ryan will discuss his work at Google in leveraging deep learning techniques and methodologies to enhance medical care through genomics and medical imaging applications.
Graph Theory for a Stratified Recommendation System in the Video Game Industry
Graph theory was mostly made popular by social networks. It's the study of graphs, mathematical structures used to model pairwise relations between objects. A simple example is a network of friends who are connected with each other in a community. But a graph can be a portfolio of games, virtual items, or even game features... During this presentation you will learn how this powerful approach can be used to radically change customer relationship management for a video game company.
Machine Learning To Combat Payments Fraud And Account Takeover
Coinbase is one of the largest digital currency exchanges in the world. We store about $3B of digital currency (bitcoin, litecoin, ether) on behalf of our users. Given the instant nature of digital currency and that it can't be revoked, we have one of the hardest payment fraud and security problems in the world. We are constantly hit by the most sophisticated scammers, and consequently we are at the forefront of the fight against fraud. We've witnessed and solved loopholes exploited by fraudsters years ahead of the broader industry (e.g., vulnerabilities in second-factor tokens delivered by SMS, phone porting attacks, loopholes in online identity verification, etc.). In this talk, I'll present examples of scammer trends and techniques we've seen through the past years. I'll also talk about our risk program that relies on rules-based systems, supervised and unsupervised machine learning as well as highly-skilled human fraud fighters.
Data Science in The Home Depot
The Home Depot is the leading retailer of home improvement, construction products and services, with over eighty billion dollars in revenue and billions of transactions across different channels. We will describe how data science has enabled our search and find, recommender, and personalization capabilities. We will also cover some open challenges that we are facing in e-commerce.
Machine Learning: The Future is Now!
Tarry will discuss: Dynamic innovation: how innovation is increasingly becoming mainstream with the application of machine learning. Dynamic skills: why individuals/professionals should focus on machine learning, BMI, etc. Future organisational structure: why a dynamic machine learning operating model is required by startups, scaleups and enterprises to sustain structure.
Bringing Physical Documents into Your Dropbox
Dropbox, building on cloud file sync and share, is reinventing how teams collaborate around content with new products like Dropbox Paper, but much content lives in the physical world. Photos are the bridge from the physical to the cloud. Here, we describe how deep learning within our mobile doc scanner, photo classifier, and OCR is used to cross that bridge.
Data Products at Glassdoor
At Glassdoor, we have one of the most comprehensive data sets around jobs including reviews, salaries, job listings, and job relationships. Using these, we build a variety of data and ML products that empower users to make career decisions, and help employers attract candidates. Hear some of the lessons we've learned during this process.