BINOY SAHA

Indian Institute of Technology Madras
M.S. (by Research) in CSE

About Me

I am a technology enthusiast with several years of experience in software development. My current interest is Artificial Intelligence, and I wish to work in this field in the future. I started learning AI in the final year of my B.E. and got involved in AI projects.

The field of AI has always inspired me to brainstorm ideas for automating laborious tasks. It intrigues me because it makes me ponder why existing machine learning models behave the way they do and how their inner workings can be made more interpretable. My passion for diving deep into concepts and working on AI projects that solve real-world problems has driven me to pursue research in this field. As an active researcher, I wish to work on improving deep learning models and to come up with useful applications.

Publications

Stutter Diagnosis and Therapy System based on Speech Processing and Deep Learning

Gresha Bhatia*, Binoy Saha, Mansi Khamkar, Ashish Chandwani, Reshma Khot; In 13th INDIACom-2019, 6th International Conference on Computing for Sustainable Global Development [* - Mentor]

  • Attempted both detection and classification of stutters in the input audio, whereas existing works focused only on detection.
  • Trained a Gated Recurrent CNN on MFCC audio features for stutter detection and classification.
  • Proposed an SVM-based system that suggests therapies based on the type and severity of the stuttering.
  • Developed an Android app and a Node.js-based server, and used Firebase for storing API requests and responses.

Key Projects


Detecting places of hideout

Advisor: Prof. Sukhendu Das - IITM

  • Proposed two novel feature-level loss functions for self-supervision of the feature extractor to make it invariant to color transformations and equivariant to affine transformations.
  • Developed a novel decoder block to extract relevant depth features from only an RGB image as input.

Scene Understanding based on Visual Intelligent System

Advisor: Prof. Sukhendu Das - IITM

  • Worked on maximal free-space direction estimation using a floor vs. non-floor segmentation map, a depth map, and several image processing techniques. Used lightweight deep learning models so that the entire module could be deployed on a robot GPU.
  • Worked on an ontology-based visual question answering system, where cues from scene graph, depth map, and segmentation map were used to answer a predefined set of questions.

Adjustable Autonomy based on Cognitive Workload

Advisor: Dr. Sushil Chandra - INMAS DRDO, Delhi

  • Developed a simulation based on neuropsychological tests with progressively increasing levels of difficulty.
  • Recorded EEG signals of several subjects to study how cognitive load affects their performance in the simulation.
  • Trained a drone operator model using Unity's ml_agents toolkit.

Golden Hour Response

Project Deep Blue - Mastek & Majesco, Mumbai

  • Developed a website using the MEAN stack to help users get medical assistance as quickly as possible. A WebView was used to embed the website in an Android app.
  • The application included a chat module, an alert system, and other helpful features.

Smart Mirror

SPRDH Project - VESIT, Mumbai

  • Designed a mirror that displayed useful, customizable information, such as weather details, the current time, cricket scores, and quotes, to the person standing in front of it.
  • The mirror used face recognition for user authentication and made several API calls to fetch data.

Image Captioning

CS6910 Deep Learning: Prof. C. Chandra Sekhar

  • Implemented a captioning model with a CNN-based (VGG16) encoder and a single-layer unidirectional RNN/LSTM-based decoder. NetVLAD was used for feature aggregation.
  • Compared the performance of the RNN-based decoder with that of the LSTM-based decoder, using BLEU score as the evaluation metric.
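
For readers unfamiliar with the metric, BLEU can be sketched as a geometric mean of clipped n-gram precisions multiplied by a brevity penalty. The snippet below is a minimal illustration only, assuming a single reference caption, pre-tokenized input, and no smoothing; the function names `ngrams` and `bleu` are hypothetical and this is not the evaluation code used in the project.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with one reference and no smoothing (illustrative)."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            # Without smoothing, a zero precision makes the whole score zero.
            return 0.0
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_precision_sum / max_n)
```

In practice, corpus-level BLEU with multiple references and smoothing is more common for captioning; this sketch only shows the core computation.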

Machine Translation

CS6910 Deep Learning: Prof. C. Chandra Sekhar

  • Machine translation [English to Tamil] using LSTM: a single-layer unidirectional LSTM was used as both the encoder and the decoder. Attention weights were calculated using the additive attention mechanism.
  • Machine translation using a Transformer model.
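
As a rough illustration of additive attention, the score for each encoder state can be computed as v · tanh(W_q q + W_k k_i) and then normalized with a softmax. The sketch below uses plain Python lists to stay dependency-free; the names `matvec`, `additive_attention`, `W_q`, `W_k`, and `v` are assumptions made for this example and do not come from the project code.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def additive_attention(query, keys, W_q, W_k, v):
    """Return (attention weights, context vector) for one decoder query.

    query: decoder state, keys: list of encoder states,
    W_q, W_k: projection matrices, v: scoring vector (all hypothetical names).
    """
    projected_query = matvec(W_q, query)
    scores = []
    for key in keys:
        projected_key = matvec(W_k, key)
        hidden = [math.tanh(q + k) for q, k in zip(projected_query, projected_key)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    # Softmax over the scores, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the encoder states.
    context = [sum(w * key[j] for w, key in zip(weights, keys))
               for j in range(len(keys[0]))]
    return weights, context
```

In a trained model, `W_q`, `W_k`, and `v` would be learned jointly with the encoder and decoder; here they are simply passed in as fixed values.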

Image Classification

CS6910 Deep Learning: Prof. C. Chandra Sekhar

  • Trained a multi-layer feedforward neural network (MLFFNN) for classification, with deep CNN features as input.
  • Trained a stacked autoencoder and a stacked RBM, then used the encoder weights to initialize the MLFFNN.

Speaker Verification

CS5691 Pattern Recognition and Machine Learning: Prof. Hema A. Murthy

  • Developed a text-independent speaker verification system based on GMMs and FLDA, using the NIST SRE’03 M dataset.

Continuous Digit Recognition using Discrete Concatenated HMMs

CS5691 Pattern Recognition and Machine Learning: Prof. Hema A. Murthy

  • Performed isolated digit recognition by training discrete Hidden Markov Models (HMMs) on recorded audio clips.
  • Used concatenated HMMs to perform continuous digit recognition.
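
Isolated-word recognition with discrete HMMs is typically done by scoring the observation sequence under each word's model and picking the most likely one. The sketch below shows that idea with a scaled forward algorithm; it assumes the audio has already been converted to a sequence of discrete symbol indices (for example by vector quantization), and the names `log_likelihood`, `recognize`, and the `(start, trans, emit)` model layout are hypothetical. It does not cover training or the concatenated models used for continuous recognition.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM via the scaled forward algorithm.

    obs: list of symbol indices; start[s]: initial state probabilities;
    trans[p][s]: transition probabilities; emit[s][o]: emission probabilities.
    """
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        # Rescale to avoid underflow and accumulate the log of the scale factor.
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


def recognize(obs, models):
    """Return the label whose HMM assigns the highest likelihood to obs.

    models: dict mapping a label to a (start, trans, emit) tuple.
    """
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))
```

For example, with one model per digit stored in `models`, `recognize(symbols, models)` would return the label of the best-scoring digit model.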

Other Projects

Image denoising using loopy belief propagation, a comparative study of eigenvalue and singular value decomposition, least squares regression, ridge regression, Bayesian classifiers, GMM, HMM, DTW.
