If you are interested in one of the projects below (Semester Projects, Master Projects at EPFL, or Master Projects in Industry), please contact the first person of reference indicated in each description by telephone, by email, or by visiting us directly at the LASA offices.
Compliance in reinforcement learning for continuous state-action spaces
Learning by demonstration provides a powerful framework for enabling a robot to perform desired tasks. However, real-world demonstrations are prone to noise and other uncertainties, especially when the teacher (i.e., the person who provides the demonstrations) can only provide suboptimal solutions. While these noisy demonstrations can speed up learning at the beginning, it is preferable for the robot to go beyond this suboptimality and reach the optimal solution for the task. This requires a delicate balance between exploiting the noisy demonstrations and exploring for the optimal solution. For humans, one key element that enables this behavior naturally is our physical compliance. When we start to learn a new task, we reproduce what we see or are told and thus stay compliant. Once we reach satisfactory confidence in our performance, we start to reduce our compliance and search for small improvements. This concept has been tackled by previous students (https://www.dropbox.com/s/phd70gjouwy84b2/Report_Semester_Project.pdf?dl=0 , https://www.dropbox.com/s/pvg2vrrv8lg226f/main.pdf?dl=0). Their work was done in a discrete world and offered keen insight into compliance. The goal of this project is to extend that work to continuous state-action spaces in order to bring it closer to implementation on a real robotic platform. This is a nontrivial step. We will look at algorithms such as policy gradient and actor-critic methods in order to find efficient ways to incorporate compliance into them. If time permits, implementation on a robotic platform will be considered, starting with simple point-to-point reaching. The student will need to become familiar with reinforcement learning theory and practice, as well as with the concept of compliance developed in the lab.
A direction will be proposed, but the student will be free to explore ideas which they deem interesting as long as they provide scientific justification for them and they are in line with the goal of the project.


Adaptive human-robot interaction: From human intention to compliant robotic behavior
Robots are mainly here to assist us with tasks that are repetitive and burdensome. Machine learning and control theory have provided us with a variety of techniques to teach robots to perform such tasks. However, the ability of robots to adapt their tasks to their environment or to the intention of their human user is limited. Providing robots with such adaptive abilities will unlock new possibilities for assistive robotics. Consider polishing as a task for a robotic arm. The robot learns how to polish from human demonstrations. During polishing, however, the human user can safely grab the robot and change the polishing direction by applying a few repetitions of movement in a new desired direction. The robot then quickly adapts its motions to the intention of the human, thus assisting him/her in performing the new task. Previously, as a first step, we proposed a method for adapting the robot's behavior to the intention of a human user. This method has been implemented and tested in simulation, and is well documented here. For the next step, the student will implement this method on a real robot. We will be using a 7-DOF Kuka LWR 4+. An impedance controller will be provided to control the end-effector of the robot, and the student will mostly focus on adaptive motion planning using dynamical systems. The method will be implemented in C++ using ROS libraries. At the end, we expect a compliant robot that polishes a surface and adapts its behavior (i.e., the location and the shape of the polishing) to the motions of the human.


Learning Manipulation with 4 Robotic Arms
Many industrial tasks require several robotic arms to work on the same piece simultaneously. This is very difficult, as the robots must perform the task without getting in each other's way. The joint workspace of the robots is highly nonconvex and cannot be expressed mathematically. This project will apply machine learning techniques to learn a representation of the feasible workspace of the 4 robotic arms. This representation will then be used in an inverse kinematic controller to control the robots' motions at run time. The algorithm will be validated by controlling 4 robotic arms in the lab that must manipulate objects on a moving conveyor belt.


Sparse Solutions for Large-Scale Regression Problems
The curse of dimensionality is one of the main challenges in 'Big Data' problems. Unless the learning algorithm has an explicitly imposed sparsity constraint, model complexity will grow with the number of samples. Typical sparse solutions for regression focus on problems where the number of samples "M" is smaller than the input dimension "P", i.e. "M < P". This project instead targets the large-sample regime, specifically datasets with > 100,000 samples, where a sparse solution is needed for efficient prediction. Two kernel-based methods are formulated to tackle such problems: 1) Relevance Vector Machines, a Bayesian formulation of Support Vector Machines that applies the Bayesian ‘Automatic Relevance Determination’ (ARD) methodology to linear kernel models. 2) Sparse Gaussian Processes with Pseudo-Inputs, whose covariance is parameterized by the locations of "M" pseudo-input points, which are learned by gradient-based optimization (analogous to 'relevance vectors'). Nevertheless, neither of these algorithms scales to more than 100k training samples, because of the optimization performed during training. Based on the literature on 1) and 2), the student should extend one of these algorithms to handle larger datasets, either by a) reformulating the optimization problem so that it becomes feasible, or b) tackling it with a divide-and-conquer approach: partitioning the large dataset into smaller subsets on which 1) or 2) can be learned, and introducing appropriate merging/aggregation schemes. The proposed approach will then be validated on an interesting real-world dataset with M > 100k. The solution shall be implemented in Matlab/Python/C++ (the student's choice).
[1] Michael Tipping. Relevance vector machine. US Patent 6,633,857, October 14, 2003.
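The divide-and-conquer option b) can be illustrated with a minimal sketch. It is not an RVM or a sparse Gaussian process: as a stand-in, each subset is fitted with ridge regression on a small, fixed set of radial basis functions, and the subset models are merged by averaging their weights. The function names, the 1-D toy data, the choice of basis centres and the averaging rule are all assumptions made for illustration only.

```python
import math
import random


def rbf_features(x, centres, length_scale=1.0):
    """Map a scalar input to RBF activations, one per centre."""
    return [math.exp(-0.5 * ((x - c) / length_scale) ** 2) for c in centres]


def solve_linear(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    w = [0.0] * n
    for i in reversed(range(n)):
        residual = aug[i][n] - sum(aug[i][k] * w[k] for k in range(i + 1, n))
        w[i] = residual / aug[i][i]
    return w


def fit_subset(xs, ys, centres, ridge=1e-3):
    """Ridge regression on RBF features for one subset (stand-in for 1) or 2))."""
    m = len(centres)
    gram = [[ridge if i == j else 0.0 for j in range(m)] for i in range(m)]
    rhs = [0.0] * m
    for x, y in zip(xs, ys):
        phi = rbf_features(x, centres)
        for i in range(m):
            rhs[i] += phi[i] * y
            for j in range(m):
                gram[i][j] += phi[i] * phi[j]
    return solve_linear(gram, rhs)


def fit_divide_and_conquer(xs, ys, centres, n_parts=5):
    """Fit one model per disjoint subset, then merge by averaging the weights."""
    models = [fit_subset(xs[p::n_parts], ys[p::n_parts], centres)
              for p in range(n_parts)]
    return [sum(w[i] for w in models) / n_parts for i in range(len(centres))]


def predict(weights, centres, x):
    return sum(w * f for w, f in zip(weights, rbf_features(x, centres)))


def demo(n_samples=2000, seed=0):
    """Toy check on noisy sin(x); returns the RMSE against the noise-free curve."""
    rng = random.Random(seed)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n_samples)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.1) for x in xs]
    centres = [-3.0 + 0.75 * i for i in range(9)]  # 9 fixed basis centres
    weights = fit_divide_and_conquer(xs, ys, centres)
    grid = [-3.0 + 0.06 * i for i in range(101)]
    errors = [(predict(weights, centres, x) - math.sin(x)) ** 2 for x in grid]
    return math.sqrt(sum(errors) / len(errors))
```

Because all subset models share the same basis, averaging weights is equivalent to averaging predictions. With an RVM or a pseudo-input GP each subset would select its own basis points, so the merging step would need a more careful aggregation scheme, which is the open part of the project.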


Towards Incremental Learning: Merging SVMs from independent sample sets
With the increase in data available online and ever-changing applications, incremental and online machine learning algorithms that can adapt, learn and unlearn will become essential in the near future. Support Vector Machines (SVMs) are among the most powerful machine learning algorithms to date; however, because the underlying optimization problem is posed for batch learning, they fall short when applied to incremental/adaptive problems. In this work, we are interested in finding a suitable solution to the problem of "incomplete datasets" or "complementary datasets" in classification. Assume we are given a dataset at a specific point in time and we must learn a model to start predicting immediately. Then, we are suddenly given a new set of samples which belong to the same dataset. The question now is: What do we do with this new data? Do we relearn the entire model with all the datapoints? What if the samples are contradictory? Can we learn a new decision function from the new samples and merge it with the old model without degrading classification performance? Can we incrementally update the old model with our new samples? What if we suddenly realize that some samples were labeled erroneously and we would like to 'unlearn' them? These are some of the questions that the student should try to answer. Few works in the SVM literature are capable of handling these issues. Those that can fall into two categories: 1) "active/online methods", where training points are fed one by one and the SVM is learned sequentially [1], and 2) "ensemble methods", where a dataset is partitioned into N sets, N SVMs are learned, and basic aggregation schemes are applied to generate a final machine [2]. These approaches, however, are mostly suited to handling large datasets and focus primarily on improving training time (i.e. efficient learning).
By leveraging ideas from 1), 2) and online convex optimization [3], the student must propose an efficient and adaptable SVM learning scheme capable of solving all [or a subset] of the issues posed by the incremental learning problem. The solution shall be implemented in Matlab/Python/C++ (the student's choice).
[1] Antoine Bordes, Seyda Ertekin, Jason Weston and Léon Bottou: Fast Kernel Classifiers with Online and Active Learning, Journal of Machine Learning Research, 6:1579-1619, September 2005.
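As a rough illustration of the "incrementally update the old model" question, the sketch below continues stochastic subgradient descent on the regularised hinge loss of a linear SVM using only the newly arrived batch. This is a generic online-learning baseline, not the method of [1] and not a proposed solution; the function names, the toy data and the step-size schedule are assumptions made for illustration.

```python
import random


def score(model, x):
    """Decision value w.x + b of a linear SVM stored as (w, b, t)."""
    w, b, _ = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def update(model, batch, lam=0.01, eta0=0.1, epochs=20, seed=0):
    """Continue SGD on the regularised hinge loss using only the new batch."""
    w, b, t = model
    w = list(w)
    batch = list(batch)
    rng = random.Random(seed)
    for _ in range(epochs):
        rng.shuffle(batch)
        for x, y in batch:
            lr = eta0 / (1.0 + eta0 * lam * t)       # decaying step size
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1.0 - lr * lam) for wi in w]  # shrink (L2 regulariser)
            if margin < 1.0:                         # hinge loss is active
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
            t += 1
    return w, b, t


def accuracy(model, batch):
    return sum(1 for x, y in batch if y * score(model, x) > 0) / len(batch)


def make_batch(rng, centre_pos, centre_neg, n=50, noise=0.3):
    """Hypothetical toy data: two Gaussian blobs labelled +1 and -1."""
    def blob(centre, label):
        return [([centre[0] + rng.gauss(0.0, noise),
                  centre[1] + rng.gauss(0.0, noise)], label) for _ in range(n)]
    return blob(centre_pos, 1) + blob(centre_neg, -1)


def demo(seed=0):
    rng = random.Random(seed)
    # Both batches follow the same rule (label = sign of x[0]) but cover
    # different regions, mimicking "complementary datasets".
    first = make_batch(rng, (2.0, 2.0), (-2.0, -2.0))
    second = make_batch(rng, (2.0, -2.0), (-2.0, 2.0))
    model = update(([0.0, 0.0], 0.0, 0), first)
    before = accuracy(model, second)
    model = update(model, second)  # the first batch is never revisited
    return before, accuracy(model, second), accuracy(model, first)
```

Keeping the step counter `t` in the model lets the step size keep decaying across batches. Nothing in this baseline protects performance on the first batch, since it is never replayed, and nothing supports unlearning mislabeled samples; those gaps are what the project asks the student to address.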


Robust Bimanual Reaching Motion for the ABB Yumi Robot
To perform many of our daily tasks, we use both of our arms (and hands). This gives us better control over our environment (better perception, higher precision, more degrees of actuation, and higher applied forces). Given the uncertainties in our surrounding environment, our ability to coordinate the motion of our arms is extraordinary. Endowing robots with the same ability would increase their performance when interacting with uncertain environments. Imagine a scenario where the robotic task is to grasp an object whose location is imprecise (only a probability distribution is available). This imprecision can be due to noisy perception or to the fact that the object is moving with unknown dynamics. In such conditions, taking the maximum-likelihood estimate for granted and performing the task in a deterministic fashion might lead to poor performance and even failure. However, the robot can perform exploratory motions to gain better knowledge about the environment (i.e., a probability distribution with higher confidence for the target), which in turn would increase the performance of the task. As the first step in this project, the student will focus on formulating a simple algorithm for simultaneous estimation and motion planning (using Kalman filters and dynamical systems). As the second step, the student will implement and test this algorithm on the ABB Yumi robot. The implementation will be done in C++ using ROS libraries, with the robot controlled in position. At the end, we expect a bimanual robot that grasps objects efficiently under environmental uncertainties.
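The first step, simultaneous estimation and motion planning, can be pictured with a deliberately simple sketch: an independent scalar Kalman filter per coordinate refines the estimate of a static target from noisy measurements, while a linear dynamical system drives a single end-effector position toward the current estimate. It is written in Python for brevity rather than the C++/ROS of the actual project, ignores bimanual coordination and exploratory motions, and all names, gains and noise levels are assumptions made for illustration.

```python
import random


def kalman_update(mean, var, measurement, meas_var, process_var=0.0):
    """One scalar Kalman filter step for a (nearly) static target coordinate."""
    var = var + process_var                    # predict: random-walk target model
    gain = var / (var + meas_var)              # Kalman gain
    mean = mean + gain * (measurement - mean)  # correct with the new measurement
    var = (1.0 - gain) * var                   # posterior variance
    return mean, var


def reach(true_target, start, steps=200, dt=0.02, stiffness=4.0,
          meas_std=0.05, seed=0):
    """Estimate the target and move toward the estimate at every time step."""
    rng = random.Random(seed)
    est = [0.0] * len(true_target)   # prior mean of the target position
    var = [1.0] * len(true_target)   # prior variance (very uncertain)
    pos = list(start)                # commanded end-effector position
    for _ in range(steps):
        for i, target_i in enumerate(true_target):
            z = target_i + rng.gauss(0.0, meas_std)  # simulated noisy perception
            est[i], var[i] = kalman_update(est[i], var[i], z,
                                           meas_std ** 2, 1e-6)
            # Linear dynamical system x_dot = -k (x - estimate), Euler-integrated
            pos[i] += dt * stiffness * (est[i] - pos[i])
    return pos, est, var
```

For example, `reach([0.4, -0.2], [0.0, 0.0])` returns the final position, the target estimate and its per-axis variance. In this sketch the motion does not depend on the variance; coupling the two, so that the robot moves cautiously or explores while the variance is still large, is the part the project is about.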


Learning Manipulation with 4 Robotic Arms
Many industrial tasks require several robotic arms to work on the same piece simultaneously. This is difficult, as the robots must not get in each other's way while performing the task. The joint workspace of the robots is highly nonconvex and cannot be expressed mathematically. This project will apply machine learning techniques to learn a representation of the feasible workspace of 4 robotic arms. This representation will then be used in an inverse kinematic controller to control the robots' motions at run time. The algorithm will be validated by controlling 4 robotic arms in the lab that must manipulate objects on a moving conveyor belt. The project will also extend the approach so that the object can be manipulated under perturbations, such as when the conveyor belt slows down or accelerates rapidly.


Detection of product purchases from shelves in unconstrained, uncalibrated, heavily cluttered environments
Our work requires us to track changes within shelves in retail environments, corresponding to shoppers picking up products and buying them or putting them back on the shelves. Our methodology involves analysing video from one or more cameras recording up to 16 h of store activity.


Extraction of profiles of shopping and purchase patterns
In the scope of our shopper behaviour studies, we are confronted with the task of segmenting shoppers into profiles according to the type of purchases they make (categories of products, departments visited, …).

