SIPL Projects

Machine & Deep Learning

Classify ECG Time Series Using Wavelet Analysis and Deep Learning
2024
Student/s: Doron Hanuka (Part A+B), Coral Kashti (Part A only)
Supervisor/s: Dr. Meir Bar-Zohar
The goal of this work is to develop a system that classifies ECG signals into two categories: arrhythmias (ARR) and normal sinus rhythm (NSR). Upon receiving an ECG signal from a subject, the system operates as follows: The temporal signal is divided into windows, resulting in a time series of windows. A Wavelet Transform is applied to each window to obtain a time-frequency representation for the time segment within the window. Features are extracted from the windows using a convolutional network trained for this task, yielding a time series of features. Predictions are made on this time series using an LSTM network, providing a prediction of the subject's cardiac condition.
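Below is a minimal sketch of the described pipeline (windowing, CWT scalograms, a per-window CNN, and an LSTM head). The window length, wavelet choice, and layer sizes are illustrative assumptions, not the project's exact configuration.

```python
# Sketch of the described pipeline: window -> CWT -> CNN features -> LSTM.
# All sizes and the Morlet wavelet are illustrative assumptions.
import numpy as np
import pywt
import torch
import torch.nn as nn

def ecg_to_scalograms(signal, fs=128, win_sec=4, scales=np.arange(1, 65)):
    """Split a 1-D ECG signal into windows and compute a CWT scalogram per window."""
    win = fs * win_sec
    n_windows = len(signal) // win
    scalograms = []
    for i in range(n_windows):
        seg = signal[i * win:(i + 1) * win]
        coeffs, _ = pywt.cwt(seg, scales, 'morl')      # (n_scales, win)
        scalograms.append(np.abs(coeffs).astype(np.float32))
    return np.stack(scalograms)                        # (T, n_scales, win)

class WindowCNN(nn.Module):
    """Per-window feature extractor (stand-in for the trained CNN)."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim))
    def forward(self, x):
        return self.net(x)

class ECGClassifier(nn.Module):
    """LSTM over the window-feature time series -> ARR / NSR logits."""
    def __init__(self, feat_dim=64, hidden=128):
        super().__init__()
        self.cnn = WindowCNN(feat_dim)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)
    def forward(self, scalos):                         # (B, T, n_scales, win)
        B, T = scalos.shape[:2]
        feats = self.cnn(scalos.flatten(0, 1).unsqueeze(1)).view(B, T, -1)
        _, (h, _) = self.lstm(feats)
        return self.head(h[-1])                        # logits for {ARR, NSR}
```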
Estimating BMI from 2D Image
2024
Student/s: Tzvi Tal Noy, Ido Sagi
Supervisor/s: Nurit Spingarn
Body Mass Index (BMI) is a crucial index that gives a quantitative assessment of whether a person is at a normal weight, underweight, or overweight. The index is calculated from height and weight data. The purpose of our project is to estimate a person's BMI from a single 2-dimensional image. This is a complex task because visual inspection of the image is not sensitive to the distance of the subject from the camera or the angle of the shot. To approach this task, we relied on previous works in the field, on their results, and on the dataset published with them.
Biometric Authentication Using PPG Signals
2024
Student/s: Inbal Ben Yehuda, Shany Danino
Supervisor/s: Yair Moshe
Today, the challenges of security and information safety are substantial, necessitating the development of high-quality and reliable verification methods. The use of biometric authentication methods is expanding as they provide secure and convenient means of verification. In this project, we explore the potential of using the PPG signal as a unique biometric authentication method. This signal represents changes in blood volume during cardiac cycles. Each individual’s PPG signal is influenced by a unique combination of physiological characteristics. This uniqueness allows us to use the PPG signal as a sort of "fingerprint" to identify the person.
Music Genre Classifier Using Deep Learning Networks
2024
Student/s: Ilay Yavlovich, Amit Karp
Supervisor/s: Hadas Ofir
In this work, we created a model based on deep neural networks to classify music genres. During the process, we segmented each song into excerpts and fed them into the model for training, validation, and testing. We used single-genre songs from the MTG-Jamendo database, which is divided into genres in an unbalanced manner (the number of songs per genre varies significantly). Therefore, we chose to work only with the ten largest genres and used different weighting schemes in hopes of improving the results.
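One common weighting scheme that fits this description is inverse-frequency class weighting in the loss; the sketch below illustrates it with made-up genre counts (the real MTG-Jamendo counts differ).

```python
# Hedged example of inverse-frequency class weights for an unbalanced
# 10-genre problem. The counts below are illustrative, not the real ones.
import torch
import torch.nn as nn

genre_counts = torch.tensor([9000., 6200., 4100., 3500., 2800.,
                             2100., 1700., 1300., 1100., 900.])   # 10 largest genres
weights = genre_counts.sum() / (len(genre_counts) * genre_counts) # inverse frequency
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 10)           # model outputs for a batch of excerpts
labels = torch.randint(0, 10, (8,))
loss = criterion(logits, labels)      # rare genres contribute more per sample
```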
Depth-Based Semantic Segmentation for Four-Legged Robot
2024
Student/s: Shany Cohen, Tal Sonis
Supervisor/s: Yair Moshe
Collaborator: RAFAEL
This work’s goal is to enable maneuvering abilities for a four-legged robot in an indoor environment by employing semantic segmentation. The segmentation is performed using deep learning, based on low-resolution grayscale and depth images captured by a Pico Flexx camera mounted atop the robot. While most existing semantic segmentation methods rely on RGB and depth images, there are no pre-trained models specifically designed for grayscale images. In the project, we adapted an architecture intended for semantic segmentation using RGB and depth images, leveraging transfer learning to tailor it to our specific requirements.
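A typical way to adapt an RGB-pretrained backbone to grayscale + depth input is to rebuild its first convolution for two channels and reuse the averaged pretrained weights. The sketch below shows this common transfer-learning heuristic; it is not necessarily the project's exact adaptation, and ResNet-18 stands in for the actual segmentation encoder.

```python
# Sketch: adapting an RGB-pretrained encoder to 2-channel (grayscale + depth)
# input. Averaging the pretrained RGB filters is a common heuristic.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
old_conv = backbone.conv1                          # Conv2d(3, 64, 7, stride=2, padding=3)
new_conv = nn.Conv2d(2, 64, kernel_size=7, stride=2, padding=3, bias=False)
with torch.no_grad():
    mean_w = old_conv.weight.mean(dim=1, keepdim=True)  # average over RGB channels
    new_conv.weight.copy_(mean_w.repeat(1, 2, 1, 1))    # reuse for both input channels
backbone.conv1 = new_conv

x = torch.randn(1, 2, 224, 224)                    # [grayscale, depth] stacked
logits = backbone(x)   # in practice the classifier head is replaced by a segmentation decoder
```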
Emotional Speech Synthesis
2024
Student/s: Sagi Eyal, Loren Tzveniashvily
Supervisor/s: Yair Moshe
Collaborator: Elbit Systems
The goal of this work was to perform emotional speech synthesis. First, we experimented with the emotional voice conversion approach, where the system receives two voice signals and transfers the emotion from one recording to the other. Later in the project, we focused on the emotional text-to-speech approach, where the system receives the transcription of the sentence we want to synthesize along with the desired emotion and generates a recording of that sentence with the given emotion. As a first step, we reproduced the results of the EmoSpeech system, which converts text to emotional speech quickly and with high quality.
Characterizing Pedestrians in Parks
2024
Student/s: Shany Zehavy, Adi Levy
Supervisor/s: Ori Bryt
This work aims to address the pressing need for high-quality public open spaces in urban environments, with a focus on leveraging computer vision and deep learning techniques. The COVID-19 pandemic has emphasized the importance of public open spaces in enhancing the well-being and quality of life for city dwellers. It has become evident that these spaces serve as vital elements in urban landscapes and play a significant role in promoting physical and mental health, social interactions, and overall community resilience.
Deep Learning for Multiple Virus Detection Tests Using Sparse Genome Reads
2024
Student/s: Eran Yermiyahu, Michal Maymon
Supervisor/s: Zuher Jahshan
This work focuses on the identification of diverse respiratory diseases through the utilization of advanced machine learning tools and neural networks applied to genomic sequences. The primary objective of our study is to develop a rapid and cost-effective diagnostic tool capable of detecting a range of respiratory illnesses and identifying the different variants of each disease. The urgency for accurate diagnosis of various respiratory diseases has become paramount, particularly considering the ongoing global COVID-19 pandemic. Additionally, the presence of comorbidities significantly heightens the risk of life-threatening complications.
Calibration of Deep Neural Networks
2024
Student/s: Yonatan Leibovich, Avichay Ashur
Supervisor/s: Yair Moshe
Deep Neural Networks (DNNs) are learned functions that consist of multiple layers between the input and output layers. These layers consist of neurons that are connected to each other, transmitting information from their input to their output. A widespread use of DNNs is to classify complex data by learning from a set of labeled examples. It has been shown that DNNs suffer from miscalibration, i.e., a misalignment between predicted probabilities and actual outcomes. For example, if we have 100 samples that the DNN is 90% confident about, we expect the network to correctly classify 90 of these samples and make mistakes in 10 of them.
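A standard way to quantify this miscalibration is the Expected Calibration Error (ECE): bin predictions by confidence and compare each bin's average confidence to its accuracy. A minimal sketch:

```python
# Minimal Expected Calibration Error (ECE) computation: bin predictions by
# confidence, then average the |accuracy - confidence| gap over the bins,
# weighted by bin population.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    confidences = np.asarray(confidences)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap          # weight by fraction of samples in bin
    return ece

# A network that is 90% confident but only 70% correct is miscalibrated:
print(expected_calibration_error([0.9] * 100, [1] * 70 + [0] * 30))  # ~0.2
```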
Image-Guided Image Generation Using Stable Diffusion and CLIP
2024
Student/s: Ido Blayberg, Ohad Amsalem
Supervisor/s: Noam Elata
In recent years, AI-driven image editing has emerged as a promising field with numerous applications. This work explores the capabilities of generative diffusion models for image editing guided by reference images. We focus on leveraging a set of images that outline the desired editing features and applying them to a target image. By experimenting with various hyper-parameters, modifying core components of the diffusion model, and integrating the CLIP model, we demonstrate various improvements in image editing performance.
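As a hedged illustration of the CLIP component, the sketch below embeds the reference images and scores a candidate edit by cosine similarity to their mean embedding. How this score is fed back into the diffusion sampler is project-specific, and the file names are placeholders.

```python
# Sketch: score a candidate edited image against a set of reference images
# using CLIP (https://github.com/openai/CLIP). File names are placeholders.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

ref_paths = ["ref1.jpg", "ref2.jpg"]              # images outlining the desired edit
refs = torch.stack([preprocess(Image.open(p)) for p in ref_paths]).to(device)
with torch.no_grad():
    ref_emb = model.encode_image(refs).float()
    ref_emb = ref_emb / ref_emb.norm(dim=-1, keepdim=True)
    target = ref_emb.mean(dim=0, keepdim=True)    # mean reference embedding

def clip_guidance_score(candidate):
    """candidate: preprocessed image batch (1, 3, 224, 224)."""
    emb = model.encode_image(candidate).float()
    emb = emb / emb.norm(dim=-1, keepdim=True)
    return (emb @ target.T).squeeze()             # higher = closer to references
```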
SAR Target Classification Using Deep Learning
2024
Student/s: Avichay Ashur
Supervisor/s: Dr. Meir Bar-Zohar
This work's goal is to develop a classifier based on a convolutional neural network to classify Synthetic Aperture Radar (SAR) targets using deep learning. Deep learning is a powerful technique that can be used to train robust classifiers and has shown its effectiveness in diverse areas ranging from image analysis to natural language processing. These developments hold huge potential for SAR data analysis and SAR technology in general, a potential that is slowly being realized. A major task for SAR-related algorithms has long been object detection and classification, known as automatic target recognition (ATR).
Digitizing The Yerushalmi Catalogue
2024
Student/s: Rami Halabi, Salah Abbas
Supervisor/s: Ori Bryt
Joseph Yerushalmi, a librarian at the University of Haifa Library, created a catalogue with around 65,000 records on paper cards. The catalogue contains articles from the 1940s to the 1970s, focusing on individuals such as artists, writers, philosophers, intellectuals, and historical figures. The collection also includes reviews of books and literary works. To preserve this valuable catalogue, digitization is needed. The project is divided into two parts. The first part is to detect text regions, which means classifying each region with its appropriate label: Title, Author, Text, or Other.
Bass Generation Based on Vocals via Deep Learning
2024
Student/s: Dror Tiferet and Rom Ben Anat
Supervisor/s: Hila Manor & Gal Gershler
This work aims to create a bass accompaniment track for a solo vocal track. This is achieved using a machine learning model trained on a comprehensive dataset of bass and vocal tracks that sound good together. The system first processes the vocal track and converts it into a spectrogram, a graphical representation of the signal's frequency spectrum over time. A generative diffusion model then produces a corresponding bass track, with the vocal spectrogram serving as a conditioning input to the model. Throughout the project, an extensive literature review was conducted to select appropriate models, including MelGAN and HiFi-GAN.
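The first processing step might look like the sketch below, which converts a vocal track to a log-mel spectrogram with librosa; the parameter values are illustrative defaults, not the project's configuration, and the file name is a placeholder.

```python
# Sketch of the first step: vocal track -> log-mel spectrogram.
import librosa
import numpy as np

y, sr = librosa.load("vocals.wav", sr=22050)          # mono vocal track (placeholder path)
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024,
                                     hop_length=256, n_mels=80)
log_mel = librosa.power_to_db(mel, ref=np.max)        # (80, n_frames), dB scale
# log_mel would serve as the conditioning input to the diffusion model;
# a neural vocoder (e.g., MelGAN / HiFi-GAN) inverts generated spectrograms
# back to waveforms.
```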
Advanced analysis methods for dynamics of functional connectivity in the brain during learning
2024
Student/s: Gali Eytan and Ariel Engelman
Supervisor/s: Dr. Hadas Benisty
Recent studies have shown that motor learning entails the dynamic reorganization of functional connectivity in the brain’s neural networks. This work investigates these dynamics by analyzing the layer 2-3 pyramidal neurons of the motor cortex in mice during motor task learning, drawing inspiration from related studies on VTA (ventral-tegmental area) dopaminergic projections and their influence on network plasticity. Both prior research and this work employ the diffusion map algorithm with Riemannian distances to effectively reduce the dimensionality of neural activity correlations.
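For reference, a basic diffusion map can be sketched as follows. This toy version uses plain Euclidean distances where the project uses Riemannian distances between correlation matrices.

```python
# Compact diffusion-map sketch: Gaussian affinity over samples, row-normalized
# into a Markov matrix, embedded with its leading non-trivial eigenvectors.
import numpy as np
from scipy.spatial.distance import pdist, squareform

def diffusion_map(X, n_components=2, eps=None):
    """X: (n_samples, n_features), e.g., vectorized correlation patterns."""
    D = squareform(pdist(X))                   # pairwise Euclidean distances
    if eps is None:
        eps = np.median(D) ** 2                # common bandwidth heuristic
    K = np.exp(-D ** 2 / eps)                  # Gaussian affinity kernel
    P = K / K.sum(axis=1, keepdims=True)       # row-stochastic Markov matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    vals, vecs = vals.real[order], vecs.real[:, order]
    # skip the trivial constant eigenvector (eigenvalue 1)
    return vecs[:, 1:n_components + 1] * vals[1:n_components + 1]

embedding = diffusion_map(np.random.rand(100, 50))    # toy input
```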
Depth Maps Quality Assessment Using Deep Features
2023
Student/s: Amit Shpigelman, Simcha Lipner
Supervisor/s: Ori Bryt
Within the field of Image Quality Assessment (IQA), there are a few main methods of producing an evaluation. One such method, which this project focuses on, is a perceptual index: a measure of a photo's visual quality, i.e., what looks good to human eyes. Our goal is to create such a measure that can assist deep neural networks in performing various tasks on depth images. Example tasks include classification, denoising, compression, and reconstruction. Our work includes an attempt to create such a measure using the responses of a DNN we designed for different datasets.
Image Reconstruction from Deep Diffractive Neural Network
2023
Student/s: Iggy Segev Gal, Tamar Sde Chen
Supervisor/s: Matan Kleiner
Deep diffractive neural networks have emerged as a promising framework that combines the speed and energy efficiency of optical computing with the power of deep learning. This has opened new possibilities for optical computing suited to machine learning tasks and all-optical sensors. One proposed application of this framework is the design of a diffractive camera that preserves privacy by only imaging target classes while optically erasing all other unwanted classes. In this work, we investigated whether this camera design truly erases the information in unwanted class data. We used K-NN to achieve up to 94% accuracy in classifying optically erased images.
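The probing experiment can be sketched as follows: fit a K-NN classifier directly on the "erased" output images and check whether class information survives. The file names below are placeholders for the actual data.

```python
# Sketch: probe "optically erased" images with K-NN. Paths are placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

y = np.load("labels.npy")                              # class labels of erased inputs
X = np.load("erased_images.npy").reshape(len(y), -1)   # flattened camera outputs
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_tr, y_tr)
print("accuracy on 'erased' images:", knn.score(X_te, y_te))
# A high score (the project reached up to 94%) indicates the class
# information was not truly erased.
```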
Identification of Dairy Cows
2023
Student/s: Kfir Bendic, Itzhak Mandelman
Supervisor/s: Ido Cohen
Classifying dairy cows is a critical operation for dairy farms. The primary goal of dairy farms is to maximize milk production, which is achieved by monitoring various aspects of each cow, including milk yield, health status, estrus time, and other characteristics. Therefore, the foremost objective of a dairy farm is to establish a reliable method for identifying each cow accurately. Currently, common methods for cow identification rely on permanent measures such as ear or back tattooing, as well as the use of ear tags equipped with radio frequency identification (RFID) technology. However, these methods have limitations as they can fade, fall off, or break over time.
Acoustic Scene Classification
2023
Student/s: Shira Lifshitz, Ellinor Elimeleh
Supervisor/s: Dr. Meir Bar-Zohar
This work deals with acoustic scene classification on a dataset published in the DCASE2017 challenge. The goal is to achieve better performance than that presented in the challenge, using neural networks and mel-spectrogram features. We present the processing of the dataset, the classifier and models, and the selected hyperparameters. The best performance was obtained using mel-spectrogram features, an EfficientNet V2 S neural network, and a MiniNet network as a selection algorithm. An accuracy of 83.33% was achieved, which is higher than the baseline performance against which we compare our results.
Classification of Heart Sounds Using Deep Convolutional Networks
2023
Student/s: Shlomi Zvenyashvili, Arik Berenshtein
Supervisor/s: Dr. Meir Bar-Zohar
Cardiovascular disease is a leading cause of death globally, with over 17 million deaths each year according to the World Health Organization (WHO). Accurate classification of heart sounds is crucial for early detection and effective management of heart conditions. However, this task is challenging due to the complexity of heart sound data, which includes variations caused by low-quality recordings and differing physiological conditions. Robust and efficient models are needed for handling such diverse data and improving diagnostic accuracy. In this work, we propose a machine learning-based solution using deep convolutional networks.
Recognizing Autism in Mice by Analyzing Their Squeaks
2022
Student/s: Itamar Ginsberg, Alon Schreuer
Supervisor/s: Dr. Dror Lederman, Prof. Hava Golan
Diagnosis of autism at an early age is an extensive area of research, as it has a massive impact on the ability to treat and aid those suffering from the syndrome. So far, diagnosis has been based on professional behavioral observation, a flawed tool: it is subjective and imprecise, and it is only effective at a late developmental stage (ages 4-5). The goal of this work is to develop a diagnostic-assist tool for classifying mice into two categories, mice with symptoms of ASD (Autism Spectrum Disorder) and mice without such symptoms, based on recordings of their squeaks.
Deep Learning Based Target Cancellation for Speech Dereverberation
2022
Student/s: Neriya Golan, Mikhail Klinov
Supervisor/s: Yair Moshe, Baruch Berdugo
Background noise and reverberation can degrade the quality of speech signals and reduce their intelligibility. Reverberations also reduce the performance of important systems such as hearing aids or voice recognition applications. There are a variety of classic methods for dereverberation of speech signals, but their performance is usually unsatisfactory and not generalizable. In light of this, there has been an increase in recent years in research on dereverberation using modern methods based on deep learning.
Textual Explorable Super Resolution
2022
Student/s: Noam Elata, Rotem Idelson
Supervisor/s: Tomer Michaeli, Yuval Bahat
In this work, we developed an explorable Super-Resolution model, which generates a high-resolution image that is both consistent with the original low-resolution image and consistent with the semantic information desired by the user. The control of the image exploration is obtained by using a text prompt that is processed for its semantic information using the CLIP network. We investigated several methods of performing this task. We first attempted expanding an existing explorable Super-Resolution network to optimize over the semantic information in the text.
Image Colorization for Thermal Mobile Camera Images
2022
Student/s: Idan Friedman, Tomer Lotan
Supervisor/s: Ori Bryt
Thermal image colorization is a topic that is gaining momentum in the world of artificial intelligence. In recent years, with a significant improvement in tools and with the algorithmic development of deep learning, the world of computer vision has achieved impressive results in everything related to image processing and analysis. A significant development that has led to this rapid progress is the Generative Adversarial Network (GAN). Networks of this type make it possible to generate new data based on the characteristics of existing data.
Speech-to-Singing Conversion Using Deep Learning
2022
Student/s: Omri Jurim, Ohad Mochly
Supervisor/s: Yair Moshe, Gal Greshler
The purpose of this work is to develop an algorithm for converting speech to singing using deep learning methods. Such a system can help memorize various short texts, like phone numbers and lists, and can also be used for entertainment. There are research papers on the subject based on classical signal processing methods as well as works based on deep learning, but so far (at the time of this project) no results have been achieved that preserve the speech content so that it remains intelligible and natural-sounding while converting it to the desired melody.
Voice Disorder Detection via Deep Learning
2022
Student/s: Yiftach Edelstein, Chen Katzir
Supervisor/s: Hadas Ofir, Dr. Ariel Roitman
The project deals with the diagnosis of various voice pathologies related to the throat and vocal cords. Today, these can only be diagnosed by a long, multi-stage process that includes listening to the patient's voice by an otolaryngologist and then an invasive examination using special equipment. We assume that voice recordings of the subjects contain plenty of information about those pathologies, and we therefore wish to use them to design a simpler diagnosis procedure based on machine learning algorithms.
Prediction of Anesthesia Depth based on EEG Signals
2022
Student/s: Nadav David, Isaac Ben-David
Supervisor/s: Hadas Ofir, Ya-Wei Lin
Collaborator: NervIO
Spinal surgery is a high-risk procedure with severe potential complications, including paralysis and permanent sensory loss. Most of these complications are preventable or can be mitigated using Intra-Operative Neuromonitoring (IONM). The field of IONM is rather new, but it is rapidly becoming a standard of care in neurosurgery, orthopedics, and ENT (ear, nose, throat) procedures. During neuromonitoring of a case, relevant bio-signals are recorded and processed prior to and during the surgery, from which neurophysiologists can detect pending neurological insults. EEG is one of the most important bio-signals in neuromonitoring, allowing assessment of the depth of anesthesia.
Image Denoising Using CNN Autoencoder
2021
Student/s: Avihu Amar, Gil Barum
Supervisor/s: Dr. Meir Bar-Zohar
In this work, we show a practical solution for image denoising using a CNN autoencoder. The network we built is easy to implement and provides relatively high performance compared to classic methods like BM3D, and even compared to other, more complex networks. The network is also very flexible and can be adjusted to match the memory capacity of the graphics cards available for training. We show how we take a relatively simple design and improve it by using custom performance metrics designed to evaluate images, replacing standard layers like MaxPool and UpSampling with convolutional layers, and implementing and comparing custom loss functions.
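A sketch in the spirit of the described design, with strided convolutions replacing MaxPool and transposed convolutions replacing UpSampling (layer sizes are illustrative, not the project's):

```python
# Denoising CNN autoencoder sketch: strided convs for learned downsampling,
# transposed convs for learned upsampling.
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),   # replaces MaxPool
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),  # replaces UpSampling
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid())
    def forward(self, x):
        return self.decoder(self.encoder(x))

model = DenoisingAutoencoder()
clean = torch.rand(8, 1, 64, 64)
noisy = torch.clamp(clean + 0.1 * torch.randn_like(clean), 0, 1)
denoised = model(noisy)        # train against clean with, e.g., MSE or an image-aware loss
```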
Error Resilient Real AdaBoost
2021
Student/s: Asaf Goren, Da-El Klang
Supervisor/s: Yuval Ben-Hur
AdaBoost is a binary classification algorithm that combines several weak classifiers into one strong classifier. The algorithm achieves relatively good results, even with nearly random base classifiers. Ever since it was published, many variants of the algorithm have been developed for different specific cases. In this project, we focus on a specific version of the algorithm, Real AdaBoost, in which the output of each weak classifier is a real number. Each number represents the confidence level of the classifier in the specific classification decision, and the final classification result is the sum of the outputs of all the classifiers.
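For concreteness, here is a compact sketch of Real AdaBoost with decision stumps, following the standard formulation in which each weak learner outputs the real-valued confidence h(x) = 0.5 * log(p / (1 - p)) and the final score is the sum of these outputs:

```python
# Real AdaBoost sketch with decision stumps. Labels y are in {-1, +1}.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def real_adaboost(X, y, n_rounds=50, eps=1e-9):
    w = np.full(len(y), 1.0 / len(y))            # uniform sample weights
    learners = []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        p = np.clip(stump.predict_proba(X)[:, 1], eps, 1 - eps)  # P(y=+1 | x)
        h = 0.5 * np.log(p / (1 - p))            # real-valued confidence output
        w *= np.exp(-y * h)                      # re-weight toward hard samples
        w /= w.sum()
        learners.append(stump)
    return learners

def predict(learners, X, eps=1e-9):
    score = np.zeros(len(X))
    for stump in learners:                       # final score: sum of confidences
        p = np.clip(stump.predict_proba(X)[:, 1], eps, 1 - eps)
        score += 0.5 * np.log(p / (1 - p))
    return np.sign(score)
```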
Ultrasonic Water Meter Calibration by Deep Learning
2021
Student/s: Tamir Bitton
Supervisor/s: Hadas Ofir
Collaborator: Arad Technologies
Water meter calibration is an essential process for maintaining the performance of a meter, but it is a complicated one. The process contains several stages, with each stage including the sampling of many measurements from the uncalibrated meters and an error calculation for each measurement. Consequently, the calibration process is very slow and expensive. The goal of this project is to significantly shorten the calibration time using a deep learning based method for predicting the results, while maintaining a certain bound on the prediction error. The system uses a dataset provided by ARAD Technologies that contains calibration factors of different water meters.
Image Manipulation with GANs Spatial Control (Award-Winning Project)
2021
Student/s: Karin Jakoel, Liron Efraim
Supervisor/s: Tamar Rott
We suggest a new approach that enables spatial editing and manipulation of images using Generative Adversarial Networks (GANs). Though many tasks have been solved utilizing the powerful abilities of GANs, this is the first time that spatial control has been suggested. This ability is made possible by a test-time spatial normalization that uses the trained model as-is and does not require any fine-tuning. Our method is therefore significantly faster and requires no further training. We demonstrate the new approach on the tasks of class hybridization and saliency manipulation.
Creating Image Segmentation Maps Using GANs
2021
Student/s: Inbal Aharoni, Shani Israelov
Supervisor/s: Idan Kligvasser
The use of GANs has drastically affected low-level vision and graphics, particularly tasks related to image creation and image-to-image translation. Despite all the latest developments, the training process is still unstable. Given a semantic segmentation map, in which each pixel is tagged with the class it represents, we can use a GAN to produce images based on this map and hope to reach a more stable model. Building on the success of GANs, we produced segmentation maps; with these maps and the help of a generative model, we can gain a semantic understanding of the dataset and even create completely new scenes.
SinGAN for Temporal Super-Resolution
2021
Student/s: Tomer Arama, Itay Shemer
Supervisor/s: Tamar Rott Shaham
Super-resolution in images and video is a complex task that draws on an array of different perceptual abilities, from object recognition to motion flow recognition. The SinGAN architecture showed that state-of-the-art super-resolution from a single training image (without priors) is possible. TSR is an architecture that performs temporal super-resolution on videos and showed state-of-the-art performance on a single training video. In this project, we modified SinGAN's architecture and explored its ability to generalize its super-resolution capabilities to 3D data (videos); the main difference from TSR's architecture is our use of GANs and the adversarial training scheme.
A Random-Projection Based Approach for Generative Modelling
2021
Student/s: Elad David
Supervisor/s: Prof. Tomer Michaeli
Generative models have been widely studied in recent years using large and costly DNN-based models. Yet, results still have much room for improvement in terms of both accuracy and runtime. In our work, we aim to tackle the generative modeling problem using a different, computationally lighter approach, based on an iterative fitting process between marginals of the source and target distributions. Intuitively, one can think of this process as an analogue of tomography, where each direction of observation adds information about the object's density. In this report, we formulate the underlying theory, demonstrate the algorithm's performance, and analyze its abilities and weaknesses.
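An illustrative toy version of this marginal-fitting idea: repeatedly pick a random direction and move the source samples so that their 1-D projection matches the target's (by rank matching, i.e., 1-D optimal transport). This is a generic sliced-transport sketch under simplifying assumptions, not the project's exact algorithm.

```python
# Iterative fitting of random 1-D marginals between source and target samples.
import numpy as np

def iterative_marginal_fit(source, target, n_iters=500, step=0.5, seed=0):
    """source, target: (n, d) arrays with the same n (else interpolate quantiles)."""
    rng = np.random.default_rng(seed)
    x = source.copy()
    for _ in range(n_iters):
        v = rng.normal(size=x.shape[1])
        v /= np.linalg.norm(v)                     # random unit direction
        px, pt = x @ v, target @ v                 # project both sets to 1-D
        ranks = px.argsort().argsort()             # rank of each source sample
        matched = np.sort(pt)[ranks]               # same-rank target projection
        x += step * (matched - px)[:, None] * v    # pull the marginals together
    return x

# Toy usage: map a standard Gaussian blob onto a shifted, stretched one.
src = np.random.default_rng(1).normal(size=(1000, 2))
tgt = np.random.default_rng(2).normal(size=(1000, 2)) * [2.0, 0.5] + [3.0, -1.0]
gen = iterative_marginal_fit(src, tgt)
```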
Features Extraction for Classification of Dolphin Sounds
2021
Student/s: Harel Plut, Or Cohen
Supervisor/s: Dr. Roee Diamant
Collaborator: ANL Haifa
With the large increase in human marine activity, our rivers and seas have become populated with boats and ships projecting acoustic emissions of extremely high power that often affect areas of 20 square km and more. The underwater radiated noise (URN) from large ships can exceed 100 PSI and is wideband, such that even at distances of several kilometres from the vessel, the acoustic pressure level is still high. While there is evidence of a clear disturbance impact on the hearing and behavior of marine mammals, there is still no systematic proof of the extent of this effect.
Generative Deep Features (Award-Winning Project)
2021
Student/s: Hila Manor, Da-El Klang
Supervisor/s: Tamar Rott Shaham
The goal of this work is to research the capability of generating a completely new image with the same visual content as a single given natural image, using unsupervised learning of a deep neural network without the use of a GAN. This project is based on the work presented in the paper "SinGAN: Learning a Generative Model from a Single Natural Image" (Rott Shaham et al.). Papers published in the last couple of years have already established the connection between the deep features of classification networks and the semantic content of images, such that we can define the visual content of an image by the statistics of its deep features.
Deep Learning Based Image Processing for a Smartphone Camera
2021
Student/s: Alexey Golub, Yanay Dado
Supervisor/s: Dr. Meir Bar-Zohar
In the first part of the project, our focus was on the PyNET network. This network was designed to replace the full ISP pipeline, which is responsible for the conversion of the raw information detected by a digital camera sensor (known as a Bayer image or a RAW image) into the color image seen on the screen (of the DSLR camera, of the smartphone, etc.). Specifically, we tested different loss functions in order to improve PyNET's performance. In the second part of the project, we explored additional ways to improve this performance.
Voice DeepFake
2021
Student/s: Idan Roth, Zahi Cohen
Supervisor/s: Yair Moshe
The goal of this work is to design a method for performing voice conversion between two speakers. The method employs deep learning techniques, particularly an autoencoder architecture, to convert the source speaker's voice into the target speaker's voice while preserving the source speaker's linguistic content. The baseline model architecture is VC-AGAIN. This model uses a one-shot approach: it is sufficient to receive, at the inference stage, a single speech signal from each of the source and target speakers, on whom the system has not been trained, in order to perform voice conversion.
Seeing Sound: Estimating Image From Sound
2021
Student/s: Sagy Gersh, Yahav Vinokur
Supervisor/s: Tamar Rott Shaham, Idan Kligvassser
The goal of this work is to train a neural network to reconstruct an image of the source of an input audio signal. Under the assumption that the audio signal contains enough features of the image that created it, we used an audio classifier to extract those features and transform them into a feature vector, from which we reconstruct the audio source image using a GAN. The transformation was achieved using a simple deep neural network, which successfully reconstructed images in a small domain (image-audio pairs from only two classes of musical instruments) in both training and test cases.
Gunshot Detection in Video Games (Award-Winning Project)
2020
Student/s: Amit Ben Aroush, Asaf Arad
Supervisor/s: Hadas Ofir
Collaborator: Waves Audio
The project's goal is to build an automated system for real-time acoustic detection of gunshots in computer game scenarios, using deep learning. The system uses a neural network to detect gunshots. The development process included a few stages. First, we constructed the dataset; during this work we tested several features and chose those that best separated audio segments containing gunshots from those that do not. The second stage was to find a network suitable for the needs of the project, train it using our dataset, and perform real-time tests to detect gunshots. This solution was compared to traditional classification methods.
Deep Image Interpolation (Award-Winning Project)
2020
Student/s: Navve Wasserman, Noam Rotstein
Supervisor/s: Tomer Michaeli
Images that describe the real world are naturally continuous functions that lose a significant amount of information when transferred to the discrete digital world. Therefore, the ability to perform various actions on the digital image is required in order to complete missing information, improve the quality of the digital image, and preserve its natural appearance and properties. The classic method that is still used today in a wide variety of applications is interpolation. In this project, we present a new method for interpolation using neural networks. The method uses a neural network to estimate the continuous function that describes each image.
Early Detection of Cancer Using Thermal Video Analysis (Award-Winning Project)
2019
Student/s: Idan Barazani
Supervisor/s: Aviad Levis, Ori Bryt
Collaborator: HT Bioimaging
Cancer is a major challenge to modern medicine. The disease has many victims everywhere in the world, and therefore substantial efforts and resources are invested in the attempt to eradicate it. As part of the characterization of diseases in general, and cancer in particular, early detection has the potential to increase the patient's chances of recovery. The primary goal of the project is early identification of external cancer (tongue/cheek/lip) using the cooling and heating patterns of these biological tissues.
Speaker Diarization using Deep Learning (Award-Winning Project)
2019
Student/s: Matanel Yaacov, Shay Avig
Supervisor/s: Nurit Spingarn
Speaker diarization is the process of dividing a given sound segment or audio stream into segments based on speaker identity. It is designed to answer the question "Who spoke and when?" and can be useful in many cases where it is important to know the speaker's identity, for example, phone calls, radio interviews, podcasts, and even emergencies where recordings from the scene are investigated (black boxes in aircraft, etc.). Until today, speaker diarization has mostly been implemented with classical algorithms from audio signal processing.
Physics Classroom Augmented Reality with Your Smartphone Part B (Award-Winning Project)
2019
Student/s: Georgee Tsintsadze, Yonatan Sackstein
Supervisor/s: Yair Moshe
The project Physics Classroom Augmented Reality with Your Smartphone is the second project with the same goal as the previous one: creating an Android app that, given a photo of a drawing of a physical system, creates a running simulation of that physical system. This project uses classic image processing algorithms and animation programming tools. It builds on a previous project that detects, classifies, and localizes objects in an image. The first stage of the project was to create an application for presenting a simple animation of a physical interaction.
Deep Learning for Physics Classroom Augmented Reality App (Award-Winning Project)
2019
Student/s: Tom Kratter, Yonatan Sackstein
Supervisor/s: Yair Moshe
The project Deep Learning for Classroom Augmented Reality Android App is a second project with the same goal as the previous one: creating an Android app that, given an image of a drawing of a physical system, creates a running simulation of that physical system. The goal of this project, similar to that of the previous project (which did not succeed), and as part of the overall solution, is to classify and localize different objects in the drawing of the physical system. Our project attempts (and usually succeeds) to do so using deep learning algorithms, as opposed to the previous project, which tried and did not manage to do so using classic image processing algorithms.
Efficient Deep Learning for Pedestrian Traffic Light Recognition (Award-Winning Project)
2019
Student/s: Roni Ash, Dolev Ofri
Supervisor/s: Yair Moshe
Crossing a road is a dangerous activity for pedestrians, and therefore pedestrian crossings and intersections often include pedestrian-directed traffic lights. These traffic lights may be accompanied by audio signals to aid the visually impaired. In many cases, when such an audio signal is not available, a visually impaired pedestrian cannot cross the road without help. In this project, we propose a technique that may help visually impaired people by detecting pedestrian traffic lights and their state (walk/don't walk) from video taken with a mobile phone camera.
Optimizing Mutual Information in Deep Neural Networks
2018
Student/s: Adar Elad, Doron Haviv
Supervisor/s: Prof. Tomer Michaeli
The recently proposed information bottleneck (IB) theory of deep nets suggests that during training, each layer attempts to maximize its mutual information (MI) with the target labels (so as to allow good prediction accuracy), while minimizing its MI with the input (leading to effective compression and thus good generalization). To date, evidence of this phenomenon has been indirect and aroused controversy due to theoretical and practical complications. In particular, it has been pointed out that the MI with the input is theoretically infinite in many cases of interest, and that the MI with the target is fundamentally difficult to estimate in high dimensions.
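For reference, the standard IB objective can be written as a Lagrangian, where T is a layer's representation of the input X and Y are the target labels:

```latex
% Information Bottleneck Lagrangian (Tishby et al.):
% minimize compression I(X;T) while preserving label information I(T;Y).
\min_{p(t \mid x)} \; \mathcal{L}_{\mathrm{IB}} = I(X;T) - \beta\, I(T;Y), \qquad \beta > 0,
% where \beta trades off compression against prediction accuracy.
```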
From Deep Features To Image Restoration (Award-Winning Project)
2018
Student/s: Ori Sztyglic
Supervisor/s: Tamar Rott, Idan Kligvasser
In recent years, the use of deep features as an image perceptual descriptor has become very popular, mainly for measuring the perceptual similarity between two images. In the field of image restoration, this has proved very useful for tasks such as super-resolution and style transfer. In this project, we suggest a different direction: rather than using deep features as a similarity measure, we suggest using them to construct a natural image prior. This can be done by learning the statistics of natural images' deep features. Using this prior, we can gain from both worlds: the deep one and the "classic" one.
Video Classification Using Deep Learning
2018
Student/s: Ifat Abramovich, Tomer Ben-Yehuda
Supervisor/s: Dr. Rami Cohen
Much recent advancement in computer vision is attributed to large datasets and the ability to use them to train deep neural networks. In 2016, Google announced the publication of YouTube-8M, a public dataset containing about 8 million tagged videos. In this project, we used this dataset to train several deep neural networks for tagging videos in a variety of categories. In the first stage, we downloaded 5000 videos for 5 different categories. Next, we trained two deep networks, with slightly different architectures, to tag a video into one of the five categories. One network uses the LSTM architecture and the other uses the BiLSTM architecture.
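The two compared architectures can be sketched as follows; the feature dimensionality matches YouTube-8M's precomputed 1024-D frame features, while the hidden size and classification from the last time step are illustrative choices.

```python
# Sketch of the compared LSTM vs. BiLSTM taggers over per-frame features.
import torch
import torch.nn as nn

class VideoTagger(nn.Module):
    def __init__(self, feat_dim=1024, hidden=256, n_classes=5, bidirectional=False):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True,
                           bidirectional=bidirectional)
        self.head = nn.Linear(hidden * (2 if bidirectional else 1), n_classes)
    def forward(self, frames):                   # (B, T, feat_dim)
        out, _ = self.rnn(frames)
        return self.head(out[:, -1])             # classify from the last time step

lstm_model = VideoTagger(bidirectional=False)
bilstm_model = VideoTagger(bidirectional=True)
logits = bilstm_model(torch.randn(4, 300, 1024))  # 4 videos, 300 frames each
```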
Pedestrian Traffic Light Recognition for the Visually Impaired Using Deep Learning
2018
Student/s: Idan Friedman, Jonathan Brokman
Supervisor/s: Yair Moshe
This project is part of a series of projects carried out in SIPL dedicated to creating an Android application that assists visually impaired people with pedestrian traffic lights. The current project consists of two parts: 1. Recognition of pedestrian traffic lights in a single image taken with a mobile phone from a pedestrian perspective. We use the Faster R-CNN object detector with transfer learning on more than 900 pedestrian traffic light images and achieve 98% accuracy. 2. Using the recognition module from part 1 along with object tracking to detect light switches from red to green or vice versa, for improved recognition robustness. For this aim, we use the KCF object tracker.
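A hedged sketch of the two components, using torchvision's Faster R-CNN and OpenCV's KCF tracker; the file paths, box coordinates, and two-class setup are illustrative assumptions.

```python
# Sketch: (1) fine-tune Faster R-CNN for one traffic-light class,
# (2) track a detected light between frames with KCF.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
import cv2   # requires opencv-contrib-python for the KCF tracker

# 1) Detector: replace the box head for (background + traffic light) classes.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)
# ... fine-tune on the annotated pedestrian traffic-light images ...

# 2) Tracker: follow a detected light box across frames for temporal robustness.
# (In some OpenCV versions this is cv2.legacy.TrackerKCF_create.)
tracker = cv2.TrackerKCF_create()
frame0 = cv2.imread("frame0.jpg")                # placeholder frames
tracker.init(frame0, (100, 50, 40, 80))          # (x, y, w, h) from the detector
ok, box = tracker.update(cv2.imread("frame1.jpg"))
```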
Advanced Framework For Deep Reinforcement Learning (Award-Winning Project)
2015
Student/s: Shai Rozenberg, Nadav Bhonker
Supervisor/s: Itay Hubara
This project is based on previous work done by Google DeepMind, in which reinforcement learning was used to teach a computer to play computer games on an Atari 2600 game console, which was popular in the 70s and 80s. In our project, we build a more advanced learning environment that supports a more advanced game console, the Super Nintendo Entertainment System (SNES), and thereby more complex and stochastic computer games. With the proper modifications to the algorithm, we improve the human-like behavior and decision process of the computer.