Event Schedule

Automated Machine Learning

In recent years, machine learning applications have increasingly caught the attention of industry worldwide. Automated machine learning (AutoML) aims to optimize existing processes in this area and make every stage of development more agile and productive. In this talk we will cover the challenges behind this automation and the trends for the future.

Level

Intermediate

Text Classification with Deep Learning

Review of the state-of-the-art machine learning models for text classification, plus a detailed case study of the most recent transformer-based models and transfer learning examples.

Level

Intermediate

Real-World Reinforcement Learning: challenges and opportunities

Reinforcement Learning has had impressive results in the last five years in very complex domains. However, these results were mostly restricted to games (Atari, Go, Chess, DOTA, StarCraft), where the algorithms can have billions of interactions with the simulated world before performing well. We will explain why things are harder in the real world, where sample complexity, safety, robustness and other factors are very important. We will conclude by pointing to the most promising current research directions for bridging that gap.

Level

Beginner

Better than Deep Learning: Gradient Boosting Machines (GBM)

With all the hype about deep learning and "AI", it is not well publicized that for the structured/tabular data widely encountered in business applications it is actually another machine learning algorithm, the gradient boosting machine (GBM), that most often achieves the highest accuracy in supervised learning/prediction tasks. In this talk we'll review some of the main open source GBM implementations, such as xgboost, h2o, lightgbm, catboost and Spark MLlib (all of them available from R and Python), and we'll discuss some of their main features and characteristics (such as training speed, memory footprint, scalability to multiple CPU cores and in a distributed setting, prediction speed, characteristics of the GPU implementations, etc.). This will provide the audience with some guidance on which implementations are worth considering depending on their use case.

Level

Intermediate

Data Science in a traditional fashion retailer - Exploring Demand Forecasting

Demand forecasting is key for the retailing industry. This is especially true in the fashion industry where product demand is volatile and the life cycle is short. In this talk I will describe the Demand Forecasting Problem in the context of a brick-and-mortar retailer. I will discuss commonly used techniques, including evaluation metrics, feature and target transformations and commonly used predictors. Finally, I will conclude with a short discussion on the challenges of successfully implementing a demand forecasting solution beyond the technical details.

Level

Beginner

What can data scientists learn from journalism?

This talk is about how best practice in visualisation and communication from the data journalism field might be applicable to data scientists wanting to broaden the audience for their research.

Level

Intermediate

Information Visualization — what, when & how?

Information Visualization has emerged as a tool to assist in the understanding and analysis of data. Through visualization, abstract data is simplified, structured and represented visually to uncover patterns of interest, often inserted in a temporal context. In this talk, we briefly present the field of Information Visualization, unveil how abstract values can be translated into the visual domain, and focus on the representation of time-series data.

Level

Beginner / Intermediate

Unraveling climate change through data visualization

Climate change is a complex, non-linear process of several intertwined elements across time and space. Mathematical models have helped significantly in understanding all these processes. However, these numerical models produce vast amounts of data that ideally should be communicated in an intuitive fashion to audiences within and beyond academic research. Data visualization tools, such as D3.js, help make models and climate change more understandable. In this talk we will discuss the challenges and lessons from the experience of developing dataviz projects in Python and D3.js for the scientific community, covering features such as cartographies, trajectories, time series, patterns, budgets, uncertainties and clusters.

Level

Beginner / Intermediate

Scaling Data Science across a "traditional" Enterprise

This talk describes methods to scale Data Science across a "traditional" enterprise. These methods cover the end-to-end use-case life cycle, from demand creation to deployment, including frameworks for prioritising use-cases, developing use-cases and path-to-production. Several technical components will be addressed during this session, from standard DevOps methods for building a Data Science Platform that scales on the cloud to standard Docker images that accelerate and automate the deployment of use-cases. It is important to highlight that this talk will not go deep into Machine Learning methods for solving a particular problem; instead, it will highlight how to make many Machine Learning libraries and frameworks available so that multiple use-cases can be solved across the Enterprise.

Level

Intermediate / Advanced

AI in Healthcare: from imbalanced datasets to product development

AI in healthcare started when researchers who were specifically interested in deep learning started sharing their findings. Then came competitions where algorithms were beating human pathologists consistently on both accuracy and concurrency. Yet in the real world, datasets do not come prepared and pre-processed. Tarry will share a few recent client and partner stories in ophthalmology, cytology and dentistry. He will discuss the challenges as well as the huge opportunity for data scientists and entrepreneurs who wish to make an impact with machine learning in healthcare.

Level

Intermediate

Cloud-Based Data Pipelines and Interfaces for Batch Data

When it comes to processing batch data, there is a plethora of technological options available from several different providers. Choosing among them can be a difficult task, as most of the official literature around them is meant to sell you specific solutions rather than help you identify whether or not they are right for your particular use case. This talk will focus first on recognizing a few common types of problems and solutions that exist in the batch data world. Then we will make the problems and solutions concrete with real-life use cases.

Level

Intermediate

Discrete Event Simulation: A practical example

Going through popular resources on Data Science, you might think that all predictive models are just variations of stateless functions that map X to Y. And "when you have a hammer, everything looks like a nail". But many processes of interest are a series of both sequential and parallel events, involving a number of agents and resources, and this is extremely difficult -- even silly -- to model using traditional methods. This is where Discrete Event Simulation (DES) comes to the rescue. In this talk, you will hear a short introduction on how it works, what tools you can use to implement it, and how we applied it to solve a nagging problem for one of our major clients.
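For readers who have not met the technique before, the core of a discrete event simulation can be sketched in a few lines: keep a time-ordered list of pending events, repeatedly pop the earliest one, advance the clock to it, and schedule any follow-up events. The single-server queue below is only an illustrative assumption (the function name, the fixed service time and the arrival times are all hypothetical), not the model or the tooling presented in the talk.

```python
import heapq
from collections import deque

ARRIVAL, DEPARTURE = 0, 1  # event kinds; arrivals sort first on time ties


def simulate_single_server(arrival_times, service_time):
    """Simulate one FIFO server (hypothetical example).

    Returns a dict mapping customer id to departure time.
    """
    # The event list is a heap of (time, kind, customer_id) tuples.
    events = [(t, ARRIVAL, cid) for cid, t in enumerate(arrival_times)]
    heapq.heapify(events)
    waiting = deque()   # customers queued for the busy server
    busy = False        # state of the single resource
    departures = {}

    while events:
        clock, kind, cid = heapq.heappop(events)  # jump to the next event
        if kind == ARRIVAL:
            if busy:
                waiting.append(cid)
            else:
                busy = True
                heapq.heappush(events, (clock + service_time, DEPARTURE, cid))
        else:  # DEPARTURE
            departures[cid] = clock
            if waiting:
                next_cid = waiting.popleft()
                heapq.heappush(events, (clock + service_time, DEPARTURE, next_cid))
            else:
                busy = False
    return departures


departures = simulate_single_server([0.0, 1.0, 2.0, 10.0], service_time=3.0)
print(departures)
```

The model is stateful (the queue and the busy flag persist between events), which is the property that a stateless mapping from X to Y cannot capture.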

Level

Beginner

What is really boosting AI?

The current success of AI, a decades-old scientific discipline, after successions of winters can be explained by many different factors, including available computing power and storage, pervasive digitalisation, the abundance of data, algorithmic maturity and popular awareness. In this talk, I will address these reasons and focus on one very important booster of AI success: the ability of learning systems to find their own way of representing knowledge about the world. From deep learning and NLP to matrix factorization, the ability to process raw data has greatly increased, unfortunately at the cost of transparency. What next?

Level

Intermediate

Applying research in stream processing for fraud detection

Stream Processing frameworks are abundant in our day and age. Have you ever felt confused, wondering why there are so many and what they are used for? This talk goes over some of the most common stream processing technologies out there and what they are used for, and puts their purpose in the context of fraud detection. Finally, we'll go over some of the research work that my team and I have been involved in with regard to stream processing and where we are thinking of going next.

Level

Intermediate

Automated Machine Learning using Auto-sklearn

In this workshop we are going to cover some ideas and general techniques behind automated machine learning, such as feature discovery, preprocessing, model selection, hyperparameter optimization and ensembling.

Requirements

Familiarity with Python and sklearn modeling.

Level

Advanced

R4Journalists: Introduction to data journalism

Every day, journalists across newsrooms fail to get a new story because they lack the knowledge to deal with data. If you are interested in how data can revolutionize your reporting skills, this workshop is for you.

Using R, a statistical computing and graphics language that can easily get you started on data analysis, we will go through the very basic steps of reporting with data to get you started in the data-driven journalism (ddj) world.

By the end, you should leave this workshop with a basic knowledge of the R language and of how to keep developing your data journalism skills.

Requirements

To participate, you should bring your laptop with R and RStudio already installed. Instructions on how to do it will be provided.

Level

Beginner

An introduction to Deep Reinforcement Learning

Brief review of Reinforcement Learning (RL) concepts (policies, state-value functions, action-value functions, Q-learning algorithm). Intuitions on why doing RL with value function approximation (e.g. neural networks) on large state spaces is hard. Explanation of the DQN algorithm (Mnih 2013), which was used to play Atari games at human-level performance. Implementation of code snippets of DQN with the support of a Jupyter notebook with pre-existing Python code skeletons. Pointers to more recent research papers on extensions to DQN (prioritized replay, double DQN, etc.).
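As a rough preview of the Q-learning concept mentioned above, the tabular update rule can be sketched on a toy problem. The environment here is a hypothetical five-state chain invented for illustration (start at state 0, reward 1.0 on reaching the last state), and the function name and hyperparameters are assumptions; this is not the workshop's notebook code, and DQN itself replaces the table with a neural network.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    Actions: 0 moves left (bounded at state 0), 1 moves right.
    Reaching the last state ends the episode with reward 1.0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # action-value table Q[s][a]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy policy; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Bootstrapped target: r + gamma * max_a' Q(s', a'),
            # with no bootstrap term once the terminal state is reached.
            if next_state == goal:
                target = reward
            else:
                target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy action in every non-terminal state.
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a])
                 for s in range(len(q_table) - 1)]
print(greedy_policy)
```

On this chain the learned greedy policy moves right in every state, and the values of the "right" action decay by a factor of gamma with each step of distance from the goal.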

Requirements

Some familiarity with Machine Learning (e.g. supervised learning, neural networks) is helpful to understand the concepts. Some familiarity with the Python programming language and either TensorFlow or PyTorch will be needed for writing code snippets.

Level

Advanced

Tell a story with a map: introduction to information visualization for communication

Information visualization is a very powerful tool when it comes to communication. As the English adage says, “A picture is worth a thousand words”. Visualization uses one of the most powerful human senses, vision, to convey information. Combined with storytelling techniques, visualization can communicate complex information more effectively. A well-designed visualization also triggers emotions in the target audience, which makes it more memorable. In this workshop we will learn how to design an effective visualization that conveys information using storytelling. We will start with the theoretical background for information visualization, including perception of visual information and visual mapping. Additionally, basic notions of storytelling will be given. In the second part of the workshop, we will build an interactive map using JavaScript and D3.js.

Requirements

Intermediate to advanced knowledge of JavaScript, basic knowledge of HTML, CSS, SVG and jQuery. Tools: code editor, local web server.

Level

Intermediate

DS@Academy

DS@Academy is an activity that will take place during the afternoon of October 25. If you are a student or have concluded your studies in the last year, don’t miss this exciting opportunity to present on stage the work you have been putting countless hours into! It’s your opportunity to shine, show your skills and share with an audience of data scientists and related professionals, companies and fellow students the latest breaking news and how your work can change the world! If you have an application case, best practice, new idea, algorithm, a great solution for a practical problem or anything else (as long as it is related to the field), make your submission through this form. The 12 selected works will win a free pass to the conference. Don’t be shy, go a step further and enter this challenge! We’re eager to hear about your project 😊

DS@Work

This activity will take place during the afternoon of October 25 and is the perfect opportunity to meet and get to know our sponsors. For approximately 2 hours you will have the chance to hear about the best things they are doing and details about their most exciting projects in data science and machine learning, presented by their collaborators. You will also have the chance to engage with the companies’ representatives and ask the questions that you never had the chance to ask before! Don’t miss this opportunity and join us!! 😉

Mind Blowing

One of the biggest pieces of news for this edition! Mind Blowing Learning, powered by INESC TEC, is one of our newest proposals. Following the format of lightning and ignite talks, prepare yourself to hear data scientists and machine learning experts talk about breakthrough discoveries, concepts, buzzing ideas and much more in a non-stop presentation. We want to spark our audience’s curiosity about the most advanced work in the field and, who knows, ignite your next million-dollar idea! This activity will take place on October 26 during the coffee breaks, near the entrance of the main stage, for about 30 minutes (don’t worry, you will still have plenty of time to eat and network). Don’t miss this great opportunity, grab a beer and join us! PS: did we already tell you that this activity is powered by INESC TEC?! 😃 Proudly one of DSPT Day’s partners!

Meet a Sponsor / Meet a Speaker

Is your “hero”, or that person you always wanted to meet, going to give a talk or keynote at DSPT Day? Then this activity is for you! As in the previous edition, this year, on October 26, you have the opportunity to book a time slot to meet and have a chat with a speaker. No more hassle or fighting for attention! Each time slot is assigned to only one person at a time, and it’s your job to make the most of it! Don’t miss the chance to book your time slot with the speaker with whom you are eager to discuss exciting topics! Who knows, the spark for your next big idea may be just a conversation away!

As in “Meet a Speaker”, we also want to give you the opportunity to chat in a more casual and relaxed way with our sponsors. If you are interested in knowing more about a company, book a time slot and we will ensure that you have the full attention of a representative, either technical or HR! Feel free to make the best use of this opportunity, because it’s all about the networking!

** Only for workshop attendees