• Typical algorithms for solving reinforcement learning (RL) problems are built on the assumption of a stationary environment (modeled as a stationary MDP), meaning the agent learns how to act in an environment whose dynamics and rewards do not change over time, so the best action in each state is not time dependent. However, many everyday problems occur in non-stationary environments, which change over time. Such problems were discussed in earlier articles...
  • The RL framework requires defining a reward function that generates a scalar reward per action, which can be a hard task. Therefore, many RL environments are described with a ‘sparse reward’, in which most of the agent’s actions receive no reward, except for an action that leads the agent to the final goal. Many RL algorithms have difficulty reaching good results in those kinds...
  • The project’s goal is to build a framework that allows adding object tracking capabilities to an existing robot. The project focused on adding those capabilities specifically to the Hexapod robot, which was built in the CRML lab as part of a project by previous students. In addition, the framework will allow future projects to acquire object tracking capabilities more easily, supporting further development in the CRML lab....
  • REPLAB is a reproducible and self-contained hardware stack (robot arm, camera, and workspace) that aims to drive wide participation by lowering the barrier to entry into robotics and to enable easy scaling to many robots. In this project we used REPLAB’s stack to grasp wood blocks placed on a tray and added new infrastructure for block classification by size. This new infrastructure includes both analytical and AI-based classifiers for...
  • Graphs are gaining popularity in research and industry due to their high representational power, which exploits the interactions between different objects in the data. Different tasks can take advantage of a graph’s unique properties. Specifically, Graph Neural Networks (GNNs) are Deep Learning models in which the nodes represent the neurons and propagate the data via the edges. By doing so, our forward and backward paths are both directly affected...
  • In most machine learning problems related to robotics and computer games, the algorithm learns how to perform a task from a visual representation of the state, which is given as input to the RL network to choose an action for each state. A question that arises is whether the agent can get the input data in a different representation, which will yield faster running times and better average scores....
  • Error correcting output codes (ECOC) are commonly used to reduce multiclass classification tasks to multiple binary classification subproblems. In ECOC, classes are represented by the rows of a binary matrix, corresponding to codewords in a codebook. Commonly, given a codebook, codewords are implicitly assigned to classes arbitrarily. In this paper, we show that the traditionally-overlooked codeword-to-class assignments play a major role in the performance of ECOC. We demonstrate that assigning...
  • The average internet user now spends 6 hours and 40 minutes a day online. While there are many apps dealing with harmful content, each user has their own personal preferences and definition of unwanted content. Project goal: build an application that allows internet users to filter out unwanted images based on their preferences, in a simple and comfortable way. Transfer learning is a method where a model that...
  • Reinforcement learning (RL) is a popular method for solving problems involving decision tasks in which the agent has only limited environmental feedback. On top of RL, curriculum learning aims to shorten the training process by training the agent on environments of ascending difficulty in order to solve tasks that otherwise can’t be taught. Despite being an acceleration method, choosing the sequence of the environments is a trial-and-error procedure. In order...
  • Telemedicine or Telehealth is the distribution of health-related services and information via electronic information and telecommunication technologies. It allows long-distance patient and clinician contact, care, advice, reminders, education, intervention, monitoring, and remote admissions. In this project, we aim to build an easy-to-use yet intelligent robot assistant that will serve as an RC gateway between long-distance patients and their families, doctors, and loved ones, providing the patient with care, monitoring,...
  • Our goal: implement the Action Robust RL framework within the OpenAI Baselines codebase and evaluate the approach across multiple algorithms. The process: we started the project by learning the content relevant to the field. To this end, we first watched the RL Course lectures by David Silver. After learning the required areas in RL we read several articles including our...
  • Main project title: “Modeling physical systems using machine learning”. Project: “Accelerating wave propagation simulations using machine learning”. Abstract: In order to describe complicated phenomena in the physical world, we use numerical simulations instead of real-world experiments. Numerical simulations have two main disadvantages: the need for large computing power, and long run times caused by the need to satisfy numerical stability conditions. In order to overcome these disadvantages we looked at...
  • Planning is a branch of artificial intelligence that tries to create general solvers that can solve any problem described by the set:  . Planners try to solve the different problems by building a graph in which nodes represent the problem states and conducting a forward search to find the optimal path from the initial state to the goal state. The number of states in a problem is exponential in the number of...
  • Shoppers rely on Home Depot’s product authority to find and buy the latest products and to get timely solutions to their home improvement needs. From installing a new ceiling fan to remodeling an entire kitchen, with the click of a mouse or tap of the screen, customers expect the correct results to their queries – quickly. Speed, accuracy and delivering a frictionless customer experience are essential.   In this project,...
  • What is Natural Language Processing? Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The result is a computer capable of “understanding” the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the...
  • This project seeks to offer a new solution to operating system page handling, a subject that has been around since the dawn of computing. The project attempts to bring the technology into the modern age using Machine Learning. It outlines the problems facing current cache paging algorithms and explains why a modern approach is necessary. There are several bottlenecks when it comes to computing,...
  • Our project examines a new deep exploration method in reinforcement learning. Reinforcement learning is an area of machine learning. It is about taking suitable action to maximize reward in a particular situation. There are two factors in the learning process: the agent and the environment. In reinforcement learning, the agent observes the current environment state, takes an action and gets a reward from the environment. The agent learns by observing...
  • Online learning is a general framework for sequential decision-making under uncertainty. On each round, the learner performs an action and observes some feedback from the environment. We apply the online learning framework to the stock exchange setting, a problem known as online portfolio selection. In this setting, at each round the learner chooses an allocation vector, called a portfolio, which specifies how the money should be invested. Hence, the feedback is...
  • REPLAB: The project’s goal is to replicate a standard, reproducible, and cheap environment for benchmarking robotic arm object grasping algorithms. The original environment was created at Berkeley and reviewed in the article “REPLAB: A Reproducible Low-Cost Arm Benchmark Platform for Robotic Learning” by Brian Yang, Jesse Zhang, Vitchyr Pong, Sergey Levine, and Dinesh Jayaraman. The standard REPLAB cell includes an arena with a wooden base, a cage comprised of metal...
  • The PID control method is used extensively for controlling UAVs and other systems. Usually, the controller design and tuning process assumes a linear system with known dynamics, making it vulnerable to highly non-linear changes, such as variations in load and environmental uncertainties. In this project, we will explore a controller implementation using neural networks, subject to small non-linearities. We will first build a simulator for a non-linear system of a...
  • Object detection is a Computer Vision task that requires finding the locations of all objects in a given image and classifying them. Deep Neural Networks (DNNs) for object detection are in continuous development and change rapidly, with several different approaches for constructing DNNs that are suitable for the task. In this project we will first examine several approaches to DNNs for object detection and then build a system that...
  • Epileptic Seizure Prediction Based on EEG Signals and Machine Learning
    Epilepsy is a neurological disease that affects about 1-2% of the human population. Epileptic seizures happen spontaneously and without warning, caused by abnormal brain activity. Epileptic patients suffer from frequent injuries and death due to the unpredictable and violent nature of the seizures. In this project, we recreated a seizure prediction algorithm presented in a certain paper. The main difference between the two algorithms is...
  • Twitter is known as a platform with great influence on public opinion. Many companies operate fake Twitter users (or “handles”) with the sole purpose of influencing public opinion. Such handles are tagged as “Trolls”. To keep Twitter from blocking the troll handles, the companies operating them try very hard to disguise them as regular handles. In this project we built a machine learning system which...
  • Following Bitcoin’s popularity in recent years, the use of cryptocurrency has been growing rapidly, and it is becoming more necessary to develop systems that can analyze its behavior and try to predict future prices. In this project we used several machine learning algorithms to predict prices an hour, a day or a week ahead and compared their performance and results.
  • The goal of the project is to classify “Amazon” reviews based on “helpful/unhelpful” votes, i.e. given a review, to predict whether it is “helpful” or not. The classification process is based on the number of “helpful/unhelpful” votes for each review in our dataset, which contains hundreds of thousands of reviews. After preprocessing the data, we will compare different Machine Learning models, some of which are relatively naïve and...
  • One of the greatest long-standing issues in the field of reinforcement learning is the ability to improve a model’s performance while training it on offline, partially observable data. This problem is most obvious in fields like autonomous cars and medical procedures, since access to a simulator is expensive, as it can jeopardize people’s lives, while at the same time massive amounts of data from drivers and doctors exist and are...
  • This project is based on the article “Watching and Acting Together: Concurrent Plan Recognition and Adaptation for Human-Robot Teams” by Steven J. Levine and Brian C. Williams (2018) from the MIT Computer Science and Artificial Intelligence Laboratory. In recent years, the field of AI (Artificial Intelligence), and especially the use of robots, has been developing rapidly, among other things, for the purpose of performing work together with humans in real...
  • Coordinating the movement of multiple trains is a hard planning task. In the general case, the computational complexity of solving such a problem is exponential in the number of trains. The project deals with solving the Flatland domain, which simulates the train coordination problem. It does so by ordering the trains by priority and planning the optimal route for each train, given the routes of the other trains. If a...
  • A complicated problem in the world of aviation today is finding an optimal flight path while considering various constraints such as defined routes, time-limited flight zones, different altitudes, arrival time, wind, fuel and more. Organizations are willing to invest a lot of resources in order to solve this problem. In our project we offer an implementation that solves this problem while considering those constraints and using information that...
  • There are many ways to solve an RL problem; we focused on a model-free algorithm. This approach requires no assumptions about the environment, which allows us to use the same algorithm to solve different games. “A more sophisticated sampling strategy might emphasize transitions from which we can learn the most” (Playing Atari with Deep Reinforcement Learning). Improve Q-Learning by changing the replay buffer to choose between...
  • In this project, we consider the problem of RL in continuous state spaces with sparse reward. In continuous state spaces, tabular methods cannot be applied directly. Thus, approximation approaches, such as function approximation methods, are used. These approaches often explore less effectively than classical tabular RL algorithms and thus perform badly in sparse reward domains. A possible solution is to discretize the space. We explored different...
  • Safety and reliability are highly important in real-world Reinforcement Learning (RL) systems – in particular in risk-intolerant applications such as autonomous driving and medical devices. A statistical test has been recently suggested to detect whenever the performance of the RL agent deteriorates. However, this test has several limitations: It measures the agent’s rewards, but ignores other information that is usually available in RL problems. For example, when an autonomous car...
  • Binary classification is a popular Machine Learning (ML) task where we wish to classify an input instance into one class out of 2 possible classes. For instance, given an image of an animal, predict whether it is a dog or a cat. In multi-class classification we wish to classify an instance into one class from a set of many possible classes (> 2). For example, given an image of an...
  • In modern times, the demand for accurate weather predictions is on the rise. In professional windsurfing, prior knowledge of the wind’s speed and direction is used for planning the optimal route at sea. If the wind’s direction is static, the surfer can sail perpendicular to the wind, and then use the wind as a boost to the target. In real life, the wind shifts all the time, and so dynamic...
  • In a high-rise building with a large number of elevators, there are many ways to move passengers from their source floor to their destination (such as moving different elevators to each floor, and deciding which elevator each passenger will take). Our project should learn an algorithm that chooses, for each state, the best action for all the elevators. The chosen action is the action that will probably give the highest reward for...
  • Making images using only Rubik’s cubes is a trending art form on the internet. In this project, you would implement a solution that takes any image as input, crops and rescales it, quantizes the colors, and uses an RL-based algorithm to solve every cube to the desired state that will fit into the image. For the cubes art see: https://www.thejakartapost.com/life/2020/02/04/rubiks-cube-mona-lisa-goes-on-sale-in-paris.html For the RL based solution you can...
  • Mortgage applications can involve difficult decisions, for both the financial institution and the client. The lender takes a big risk: approving mortgages for borrowers who may not be able to pay them off in the future. Errors that banks have made in approving mortgages too easily have led to financial crises in the past, which emphasizes the importance of cautious decision making. The client also takes a major...
  • In this project we want to solve the blocks-world problem. We have a table with four blocks in different colors and five spots where we can place the blocks (four of them are already taken in the given state). We call this state the initial state. Our goal is to solve the problem with an autonomous agent, meaning to change the blocks’ order to an order given by the user in the...
  • In this work, we use the LSTM variant of Recurrent Neural Networks to predict the price of Bitcoin. We describe the dataset, which is comprised of data from an API to Coinmarketcap, and how it is preprocessed. Further on, we show the usage of the LSTM architecture with the aforementioned time series. To conclude, we outline the results of predicting the next day’s Bitcoin price and display several statistical analyses.
  • Monitoring the number of people in a room is important for a number of reasons, especially in these days of dealing with the coronavirus and the need to limit gatherings. In addition, it is necessary to maintain the privacy of the people in the room, so we would like to avoid using cameras for surveillance and counting. Our solution to these requirements is to use...
  • Mastering Blackjack is an introductory Reinforcement Learning project. Reinforcement Learning is a branch of machine learning where an agent acts in an environment and gets rewarded for its actions. The agent’s goal is to learn a behavior policy that will maximize its reward over time. In our project we studied the game of Blackjack, introduced RL algorithms such as Monte-Carlo Learning, Q-Learning and DQN, tested their performance with...
  • The Sudoku game is a number placement puzzle that has gained popularity in newspapers and among the public in the last few years. In parallel, machine learning algorithms have also gained massive popularity. One way to solve Sudoku automatically is by brute force, trying every possible solution using a backtracking algorithm. The problem with this method is that it is very time consuming and has high complexity. In this project...
  • Graph Neural Networks (GNNs) have recently become the de facto standard for modeling relational data. Traditional activation functions (like ReLU and TanH) with a vanilla MLP fail to encode continuous representations of data signals. Periodic activation functions, like the sine and cosine functions, hold better frequency representations due to the smoothness of the derivatives that are calculated in the optimization process (the Jacobian and Hessian matrices). In this...
  • In this project we were asked to deliver a software infrastructure for robot motion planning. We were to implement it in Python using the open-source Open Motion Planning Library, for the Ubuntu 18 operating system. In the future the infrastructure could easily be used in more complex robot motion planning problems in multi-space environments with obstacles. In order to present a properly working infrastructure we were...
  • In this project we will examine a complementary approach to Hierarchical Reinforcement Learning, namely using context free grammars. This approach allows the agent to commit to specific temporal structures specified by a formal language over actions. By using this method, we expect to improve the performance and sample efficiency of the learning process of the agent. It will provide the ability to impose safety constraints on an agent. Furthermore, the...
  • Planning is an important human ability, starting at a young age. To perform this task, the brain must synchronize and combine several cognitive systems, such as controlling and processing the resulting data. In this project, we will implement an algorithm that performs efficient planning in order to find an order for the takeoffs and landings of aircraft. The main aspect is finding the optimal order, which will...
  • A critical challenge in goal-conditioned reinforcement learning (GCRL) is exploration; unlike “classical” RL, in the GCRL setting every goal defines a different set of successful trajectories. Thus, roughly speaking, the exploration problem in GCRL also depends on the goal state. A common approach to tackle this exploration problem is to imagine goals in hindsight, meaning that given a (possibly failed) trajectory, the agent tries to answer: “which goals does...
  • Utilizing off-policy offline data is still a hard problem for reinforcement learning agents. In this project we will investigate methods of making unbiased use of very large datasets through experience replays. Our agent will simultaneously learn how to efficiently sample helpful transitions while learning to gather new experience.
  • In this project, we simulate the use of a drone to navigate autonomously in a construction area while analyzing workers’ safety. The drone uses offline planning to plan its route over the area, takes pictures and evaluates, using a pre-trained CNN, whether there is a worker in the images and, if so, whether the worker is wearing a helmet (i.e., works safely). This simulation is a proof that integration between all the...
  • The main goal of this project is to compare different learning algorithms. The chosen algorithms were Boosting and Deep Learning. A film information dataset was downloaded from the movie box office prediction Kaggle challenge. Feature enrichment was performed on this dataset, mostly through simple mathematical and programmatic transformations, in order to obtain more features that can be easily manipulated.
  • Future market forecasting techniques may be classified into two major categories: (1) fundamental analysis and (2) technical analysis. Fundamental analysis is based on macro-economic data such as Purchasing Power Parity (PPP), Gross Domestic Product (GDP), Balance of Payments (BOP), Purchasing Manager Index (PMI), Central Bank outcomes, etc. On the other hand, technical analysis focuses on past data and potential repeated patterns within those data. The major point here is that...
  • Deep Reinforcement Learning (DRL) has been a thriving field in recent years, as many breakthroughs were achieved in video games played by AI. Most notable is the Deep Q-Network (DQN) architecture by DeepMind (acquired by Google) that solved many Atari games. In this project we will build on a recent paper “Shallow Updates for Deep Reinforcement Learning” (written by researchers from our faculty) which combines deep learning methods with “shallow”...
  • Reinforcement learning has achieved significant, widely publicized and influential results: RL-based systems in a variety of games such as StarCraft, Go and poker (games that until recently were considered unsolvable due to their enormous number of states) reached results that surpass human performance. The two most important elements of RL are Action and State; embedding the current state of an RL environment can help reduce the dimension...
  • Airbnb presents listings of properties that include detailed data and the rental price, so it can be assumed that there is a reasonable connection between the property details and the rental price asked. This assumption can be translated into mathematical and statistical models that can be learned by learning algorithms. Machine learning is the scientific study of algorithms that build a mathematical model based on sample data, known as “training data”,...
  • U-Net is a state-of-the-art, widely used deep convolutional neural network for image segmentation. It consists of an encoder-decoder style architecture with skip-connections between the two to localize high-resolution features. In this project we examined a possible improvement to the architecture by adding an unsupervised learning method (an auto-encoder) to a U-Net architecture variant (U-Net with ResNet-34 as the encoder). We used the data and evaluation system from a Kaggle competition – “Understanding Clouds from Satellite...
  • Project goal: use a tweet database, stock data and machine learning algorithms to predict future stock movements. Our solution: we reached our solution in 6 major steps. 1. Build a database of tweets using the Twitter developer API: we collected ~150k tweets from 15.12.20 to 5.3.20 using a Python script run daily; the tweets were associated with 19 major companies using relevant keywords; the keywords were categorized as “Safe”...
  • The project deals with the forecasting of cryptocurrency on a small exchange by analyzing the same behavior on a larger exchange. The main requirement of the project was to create a platform that would fulfill the desired prediction and, in addition, enable the trading of cryptocurrencies and interface with the exchanges. In order to produce such a platform, we started with a simpler task and created a platform that interfaces...
  • Reinforcement Learning is a branch of machine learning where an agent acts in an environment and gets rewarded for its actions. The agent’s goal is to learn a behavior policy that will maximize its reward over time. Some of the research done in this field is about extending RL to multi-agent settings. The goal of this research is to solve problems where many agents act in the same environment,...
  • Image denoising is an important image processing task that has experienced significant advancement, made possible by progress in the development of convolutional neural networks and deep learning techniques. These techniques are based on learning the features of clean images during a training process over large datasets of clean images. The paper Deep Image Prior established that convolutional neural networks are inherently good at generating natural looking images and...
  • Colonoscopy has become very common in recent decades, and given that all referrals are examined manually by doctors, the process of recommending colonoscopy preparation is inefficient. Moreover, it is very difficult for an endoscopic gastroenterologist to become proficient in all available clinical guidelines and all new drugs in order to prepare the patient in the best way. In this project, an automation system as a solution is...
  • Meta-RL is meta-learning on reinforcement learning tasks. After being trained over a distribution of tasks, the agent is able to solve a new task by developing a new RL algorithm. Train: throw a bunch of problems with the same core structure at the model. Test: the meta-RL agent will be able to rapidly identify the key parameters of any new problem, eventually achieving optimal performance on this new problem.
  • In this project, we propose an advanced approach to off-policy learning, in which we suggest better and more sophisticated handling of the data the agent acquires in order to improve its performance. We apply those methods directly to Google’s novel framework, Dopamine, which trains agents employing a variety of methods to play Atari 2600 games as in Mnih et al. (2013). Since the agent acquires data samples sequentially, they are usually highly correlated....
  • Project’s goal: use Reinforcement Learning to learn social laws in a multi-agent environment. These laws will enable autonomous cars to cross intersections without accidents. Unity and ML-Agents: we used Unity to model our environment and to run simulations. In addition, we used the ML-Agents tool, which trains intelligent agents with Reinforcement Learning via a simple API. ML-Agents connects the simulation environment to the learning algorithm.
  • The goal of this project is to test a new continuous reinforcement learning algorithm in several different simulated environments. The algorithm, SIMPLE, uses a neural net to simulate the reward function for the environment’s state, similarly to the DDPG algorithm. The project’s contribution comes from its unique approach to finding an optimal action in a continuous action space. Using the neural net as a model for the reward function, it...
  • These days, robots are increasingly used to perform different tasks. In some cases, the robot needs to be able to determine its location independently in order to carry out its work. Depending on the task, there are solutions for this problem; for example, GPS can be used in places where coverage exists. The localization problem becomes harder when working in an area with low GPS...
  • Knowledge distillation (KD) has shown significant improvement in supervised learning in the last couple of years and has become a de facto part of computer vision frameworks. In this project you will learn about KD methods, experiment with KD in practice and try to learn an adaptive pace of teaching (based on similarities, uncertainty and more). This project has research potential.
  • In this project, we will examine replacing rectified linear unit (ReLU) activations with hysteresis activation functions. The idea is that for some tasks, Batch Normalization layers would learn that some of the activations (the intermediate features) need to be smaller, whereas ReLU trims the negative part of the network. In that case, we might want to keep track of gradients of values that are close to zero, yet negative. In...
  • A brain-computer interface is a technology at the center of attention of many research groups and giant companies around the world. A direct interface between the brain and the computer will allow people with disabilities to communicate with their environment, for example by operating devices such as a wheelchair or smart-home equipment, or by browsing the internet, and in some cases it will be their only means of communication. Brain waves are not stationary, and therefore the accuracy of a fixed model that operates on the best features found at training time decays over time. This challenge is addressed...
  • ML is very powerful for identification, localization and segmentation, but is usually applied to more structured data (e.g., images). Hardware reverse engineering (HRE) means understanding the operation and internal structure of a circuit from external measurements. Project goals: build an infrastructure for comparing different ML approaches to HRE; reproduce the results of [1]; explore off-the-shelf modern ML techniques for HRE.
    Categories:
  • Sensors are a significant component of many technologies today. Our project deals with distance or location sensors. As with all other sensors, their measurements contain an error. This error can be split into two parts: a fixed offset error, which can be found from a set of measurements and then calibrated out by rebooting with the offset error compensated, and another...
    Categories:
  • Lung cancer stage classification using healthy tissue and tumor transcriptome. In cooperation with Ori Mayer, MD PhD student, Prof. Noam Shomron’s lab, Sackler Faculty of Medicine, Tel-Aviv University. Gene expression profiling is available and affordable due to recent advances in sequencing technologies. Sequencing generates a vast amount of data, which doctors cannot process manually to derive meaningful insights. The goal of this project is to build a model which predicts tumor...
    Categories:
  • In this project, we aim to devise a new scheme of adversarial learning which perturbs the input to hidden layers, and not solely the input layer. We will consider training the network with attacks that start by perturbing the upper layers, where it may be easier for the network to generalize, and continue to lower layers. Another approach will add perturbation in each layer (e.g. by making use of the...
    Categories:
  • Accurate forecasts of wind speed and direction are crucial for Olympic sailors to plan the optimal sailing path and win a competition. Today’s forecasts are based on numerical solutions of complex mathematical models, which are inaccurate. This inaccuracy might cause the sailor to plan a sub-optimal path and lose an Olympic medal. In this project we try to build a more accurate model based on machine learning methods, in order...
    Categories:
  • In this project, we demonstrate a visualization methodology and use it to analyze the learning process of A2C agents trained on the game Atari Breakout (Atari2600). We present examples of using our method to explore how A2C agents learn to carve side-tunnels, avoid states that lose the game, and acquire other useful skills. Our method uses a modified version of the popular nonlinear dimensionality reduction algorithm t-SNE. The modification includes...
    Categories:
  • Recent advances in Reinforcement Learning have highlighted the difficulties in learning within complex high-dimensional domains. We argue that one of the main reasons that current approaches do not perform well is that the information is represented sub-optimally. A natural way to describe what we observe is through natural language. In this paper, we implement a natural language state representation to learn and complete tasks. Our experiments suggest that natural...
    Categories:
  • Abstract: In this project we’ve implemented a music-generating neural network based on the Variational Autoencoder (VAE) architecture. The network was trained on a dataset of 500 MIDI jazz files, which were represented as piano rolls. Model data representation: each MIDI file is converted to a piano-roll, a matrix representation of music with time and pitch axes. Then, the piano-roll is divided into slices of the same size. The slices are paired...
    Categories:
  • Hyperparameter optimization is both a practical issue and an interesting theoretical problem in training of deep architectures. Despite many recent advances, the most commonly used methods almost universally involve training multiple and decoupled copies of the model, in effect sampling the hyperparameter space. We show that at negligible additional computational cost, results can be improved by sampling paths instead of points in hyperparameter space. To this end we interpret hyperparameters as controlling the level of correlated noise in the training, which can be...
    Categories:
  • We propose a DQN-based technique to generalize the concept of skills. In previous papers, we have seen attempts to hand-craft skills (Chen Tessler, 2016). The idea is that skills reduce the effective size of the MDP by enabling the agent to plan at a higher level. Moreover, every specific task will be performed in the best way. However, skills are specific to every domain. In...
    Categories:
  • Automatically generating images according to natural language descriptions is a fundamental problem in many applications such as computer-aided design, photo editing, art generation, and video games. AttnGAN is a neural architecture that combines the attention mechanism from the field of natural language processing with generative adversarial networks from the field of image processing. In this project, we implemented the architecture suggested by the original researchers, trained the network on...
    Categories:
  • Modern industrial plants contain tens or hundreds of robots, sometimes of considerable value. Detecting faults at the right time can save tens if not hundreds of thousands of dollars and even prevent harm to human life. The amount of data stored in databases sometimes burdens computation and significantly hampers the ability to classify between different sets of data. Data reduction in general, and dimensionality reduction in particular, is therefore becoming a serious problem that requires a well-considered solution. In Project I, two dimensionality reduction algorithms will be examined:...
    Categories:
  • Robots are becoming increasingly common in modern industries, performing diverse tasks in areas such as manufacturing, e-commerce and even medicine and healthcare. However, robots are powerful only when the task is repetitive and well defined – generalized robotic manipulation is still a very complex operation, since in many realistic cases object shapes and poses are unknown in advance. This is challenging even to powerful recent methods such as reinforcement learning,...
    Categories:
  • A brain-computer interface is a technology at the center of attention of many research groups and giant companies around the world. A direct interface between the brain and a computer will allow people with disabilities to operate devices such as a wheelchair or smart-home equipment and to browse the internet, and in some cases it will be their only means of communication. Brain waves are non-stationary, so the accuracy of a fixed model that relies on the best features at training time decays over time. This challenge is addressed by several methods, for example: adaptation...
    Categories:
  • In recent years, adaptive optimization algorithms have seen increasingly wide use in learning problems. These algorithms change the step size on the fly during training, according to the observed data. Their efficiency and effectiveness have made them the default choice for training deep networks today, with Adam being the most common. In this project we are interested in another adaptive algorithm that was proposed recently: Stochastic Polyak Step-size. The algorithm enjoys theoretical guarantees on its convergence rate and its performance...
    Categories:
  • In this project we implemented a trading agent for cryptocurrencies, which uses the currencies’ price and volume history, as well as published posts and comments on Reddit forums. Its goal is to adjust the portfolio appropriately in order to maximize profit. In the project we rely on the assumption that a correlation exists between a currency’s price and the textual publications about it on the different forums. To...
    Categories:
  • We address the problem of solving high-dimensional MDPs using their minimization by sampling and solving a navigation problem on a graph using RMAX. Many previous attempts optimize the policy using deep models with some success but lack theoretical guarantees. Our methods provide a framework to incorporate classical reinforcement learning algorithms which provide guarantees. Moreover, the method can be extended using deep models to solve more specific problems. The...
    Categories:
  • We encounter forged images on a daily basis. From image enhancement on Instagram, Snapchat, etc. to more sophisticated forgeries done on advanced platforms like Photoshop and GIMP, image editing is no longer a professional game. Most forgeries are clear even to the naked eye, but with a few hours of YouTube tutorials, one can fool most people with one’s forgery skills. The problem at hand results from the combination of...
    Categories:
  • The sinking of the RMS Titanic is one of the most infamous shipwrecks in history.  On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships. One of the reasons that the shipwreck led to such loss of life was that there...
    Categories:
  • Machine learning has gained a lot of interest in the last decade, especially due to impressive advances in deep learning. A typical assumption in machine learning is that the data is i.i.d. from some unknown data distribution. However, in many real-world domains this assumption does not hold, and instead we have some temporal structure in the data. In such cases, it is known that standard optimization algorithms (e.g., SGD) suffer...
    Categories:
  • Deep Learning (DL) has lately become a core component in solutions to many problems. Reinforcement Learning (RL) is one domain that benefits from Deep Learning, in what is called Deep Reinforcement Learning (DRL). One successful DRL network is the Deep Q Network (DQN). DQN concentrates on solving a specific set of Atari games and excels at winning them. Most real-world problems and computer games are non-Markovian. This means that a given...
    Categories:
  • Graph Neural Networks have rapidly grown in popularity in recent years due to their ability to learn non-pixel data representations. However, their robustness to noisy data or other kinds of perturbations is still not adequately explored. In this project, we will investigate various adversarial attacks and hopefully propose methods that increase network robustness against them. Related work: https://arxiv.org/pdf/1805.07984.pdf, https://arxiv.org/abs/1809.01093, https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7974879
    Categories:
  • CW/DDN are adversarial attack methods that aim to compute the minimal epsilon for which an adversarial attack exists. However, their computational time is quite high. Both methods work by local search for the minimal epsilon. In this project we therefore wish to examine an approximation of the minimal epsilon in order to reduce the computational complexity. We can estimate the epsilon according to the model accuracy on randomly corrupted data.
    Categories:
  • MuZero is a recent reinforcement learning (RL) algorithm that learns how to plan by combining ideas from the planning and the RL communities. In this project we would like to investigate how well MuZero can solve the motion planning problem. In the motion planning problem, a robot needs to navigate from a start position to a goal position without colliding with obstacles along the way. This problem is particularly difficult...
    Categories:
  • The world we live in is advancing rapidly towards global usage of digital applications and wireless communication in all fields of life, from smart homes, through IoT, to autonomous cars. The growing usage rates of these digital applications bring many dangers and threats, mainly regarding security breaches and malicious usage of these apps: faking of data for deceiving security systems and for getting access to private...
    Categories:
  • In our project we created a system that can independently download movie reviews, classify them, and decide for each review whether it is positive or negative. The reviews are collected by an independent script (crawler) that can search the internet and download a large number of reviews. For classifying each review, we created a trained module based on machine learning algorithms with different kinds of...
    Categories:
  • We consider a distributed reinforcement learning framework where multiple agents interact with the environment in parallel, while sharing experience, in order to find the optimal policy. At each time step, only a subset of the agents is selected to interact with the environment. We explore several mechanisms for selecting which agents to prioritize based on the reward and the TD-error, and analyze their effect on the learning process. When the...
    Categories:
  • NeuroEvolution is a field in AI which uses evolutionary algorithms and can be used to solve reinforcement learning problems. The process in those algorithms is to initialize a simple neural net (usually with one hidden neuron only) and then perform evolution on the weight space to get the best fitness score. This project is based on a paper [1] that shows that evolution of the topology can improve the performance in control tasks...
    Categories:
  • The project’s main objective was to develop a methodology for solving planning problems involving time by using simple planning tools. During the project I studied the field of classical and temporal planning, learned the PDDL language, which is used for describing planning problems, and designed a method for solving temporal planning problems using a classical planning solver written in Python (Pyperplan).
    Categories:
  • In this project, we consider the Inverse Reinforcement Learning problem in Contextual Markov Decision Processes. Here, the reward of the environment depends on a hidden static parameter referred to as the context, i.e., each context defines an MDP. The agent does not observe the reward, but instead, is provided with expert demonstrations for different contexts. The goal of the agent is to learn a mapping from contexts to rewards that...
    Categories:
  • Convolutional neural networks (CNNs) compute their output using weighted sums of adjacent input elements. This method enables CNNs to achieve state-of-the-art results in a wide range of applications such as computer vision and speech recognition. However, it also comes at the cost of high computational intensity. Shomron et al. proposed exploiting the spatial correlation inherent in CNNs to predict activation values, thus reducing the needed computations in the network. They introduced...
    Categories: