Binoy Saha

Work Experience

Researcher / Project Associate

Visualization & Perception Lab, IIT Madras (2019 - Present)

Implemented, managed, and coordinated project deliverables as part of a four-member team.

Research Intern

INMAS DRDO, Delhi (2018)

Developed a simulation based on neuropsychological tests with progressively increasing levels of difficulty.

Recorded EEG signals of several subjects to study how cognitive load affects their performance in the simulation.

Trained a drone operator model using Unity's ML-Agents toolkit.

Software Engineering Intern

Accentiv India Pvt. Ltd, Mumbai (2017)

Developed a hybrid mobile app.

Developed a responsive, mobile-first HTML template with specially designed e-commerce pages to give a near-native mobile experience in a responsive web app.

Software Engineering Intern

Reis Ltd (Startup), Mumbai (2017)

Developed a responsive website for online food ordering with a CMS and an inventory management system.

Implemented an app shell architecture and lazy loading for performance optimization, and performed on-site search engine optimization.

Web Developer

Computer Society of India - VESIT (2017)

Created the official website for CSI VESIT using the Laravel framework.

Co-ordinated and executed technical events organized by the council.

Conducted workshops on PHP and JavaFX.

Technical Skills

Programming Languages

Python, C, C++, Java, Bash, Assembly, HTML, CSS, PHP, JavaScript

Frameworks

PyTorch, TensorFlow, OpenCV, Laravel, CodeIgniter, Bootstrap, Node.js, Express, Angular

Database Management Systems

MySQL, MongoDB, SQLite, Firebase

Tools

LaTeX, Git, Postman, Unity, XAMPP

About Me

I am a technology enthusiast with several years of experience in software development. My current interest is Artificial Intelligence, and I wish to work in this field in the future. I started learning AI in the final year of my B.E. and have been involved in AI projects since then.

The field of AI has always inspired me to brainstorm ideas for automating laborious tasks. It intrigues me because it makes me ponder why existing machine learning models behave the way they do and how their inner workings can be made more interpretable. My passion for diving deep into concepts and for working on AI projects aimed at real-world problems has driven me to opt for research in this field. As an active researcher, I wish to work on improving deep learning models and to come up with useful applications.

Publications

Stutter Diagnosis and Therapy System based on Speech Processing and Deep Learning

Gresha Bhatia*, Binoy Saha, Mansi Khamkar, Ashish Chandwani, Reshma Khot; In 13th INDIACom-2019, 6th International Conference on Computing for Sustainable Global Development [* - Mentor]

Attempted to detect and classify stutters in the input audio while existing works focused only on detection of stutters.
Trained a Gated Recurrent CNN on MFCC audio features for stutter detection and classification.
Proposed an SVM-based system that can suggest therapies based on the type and severity of the stuttering.
Developed an Android app and a Node.js-based server, and used Firebase to store API requests and responses.


Key Projects

Detecting places of hideout

Advisor: Prof. Sukhendu Das - IITM

Proposed two novel feature-level loss functions for self-supervision of the feature extractor to make it invariant to color transformations and equivariant to affine transformations.
Developed a novel decoder block to extract relevant depth features from only an RGB image as input.

Scene Understanding based on Visual Intelligent System

Advisor: Prof. Sukhendu Das - IITM

Worked on maximal free-space direction estimation, using a floor vs. non-floor segmentation map, a depth map, and several image processing techniques. Used lightweight deep learning models so that the entire module could be deployed on a robot GPU.
Worked on an ontology-based visual question answering system, where cues from scene graph, depth map, and segmentation map were used to answer a predefined set of questions.

Adjustable Autonomy based on Cognitive Workload

Advisor: Dr. Sushil Chandra - INMAS DRDO, Delhi

Developed a simulation based on neuropsychological tests with progressively increasing levels of difficulty.
Recorded EEG signals of several subjects to study how cognitive load affects their performance in the simulation.
Trained a drone operator model using Unity's ML-Agents toolkit.

Golden Hour Response

Project Deep Blue - Mastek & Majesco, Mumbai

Developed a website using the MEAN stack to help users get medical assistance at the earliest. A WebView was used to embed the website into an Android app.
The application included a chat module, an alert system, and several other helpful features.

Smart Mirror

SPRDH Project - VESIT, Mumbai

Designed a mirror that displayed useful, customizable information such as weather details, the current time, cricket scores, and quotes to the person standing in front of it.
The mirror used face recognition for user authentication and made several API calls to fetch data.

Image Captioning

CS6910 Deep Learning: Prof. C. Chandra Sekhar

Implemented a captioning model with a CNN (VGG16) based encoder and a single-layer unidirectional RNN/LSTM-based decoder. NetVLAD was used for feature aggregation.
Compared the performance of the RNN-based decoder with that of the LSTM-based decoder, using BLEU score as the evaluation metric.

Machine Translation

CS6910 Deep Learning: Prof. C. Chandra Sekhar

Machine translation (English to Tamil) using LSTMs: a single-layer unidirectional LSTM was used as both encoder and decoder. Attention weights were calculated using an additive attention mechanism.
Machine translation using a Transformer model.
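
For readers unfamiliar with additive attention, the following is a minimal, illustrative sketch in plain Python. It is not the project code: the function name `additive_attention`, the parameter names `W_q`, `W_k`, `v`, and the toy values are all assumptions chosen for the example. It scores each encoder state against the current decoder state with a small tanh layer and normalizes the scores with a softmax.

```python
import math


def additive_attention(query, keys, W_q, W_k, v):
    """Return softmax-normalized additive attention weights over `keys`.

    score_i = v . tanh(W_q @ query + W_k @ key_i)
    All names and shapes are illustrative assumptions, not project code.
    """
    def matvec(M, x):
        # Plain matrix-vector product on nested lists.
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]

    q_proj = matvec(W_q, query)
    scores = []
    for key in keys:
        k_proj = matvec(W_k, key)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Numerically stable softmax over the encoder positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: one decoder state attending over three encoder states.
decoder_state = [0.5, -0.2]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_q = [[0.1, 0.2], [0.3, 0.4]]
W_k = [[0.5, -0.1], [0.2, 0.3]]
v = [1.0, -1.0]
weights = additive_attention(decoder_state, encoder_states, W_q, W_k, v)
```

The resulting `weights` sum to one and would be used to form a weighted average of the encoder states as the context vector for the next decoding step.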

Image Classification

CS6910 Deep Learning: Prof. C. Chandra Sekhar

Trained a Multi-Layer Feedforward Neural Network (MLFFNN) for classification, with deep CNN features as input.
Trained a stacked autoencoder and a stacked RBM, then used the encoder weights to initialize the MLFFNN.

Speaker Verification

CS5691 Pattern Recognition and Machine Learning: Prof. Hema A. Murthy

Developed a text-independent speaker verification system based on GMMs and FLDA, using the NIST SRE’03 M dataset.

Continuous Digit Recognition using Discrete Concatenated HMMs

CS5691 Pattern Recognition and Machine Learning: Prof. Hema A. Murthy

Performed isolated digit recognition by training discrete Hidden Markov Models (HMMs) on recorded audio clips.
Used concatenated HMMs to perform continuous digit recognition.
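
As a rough illustration of the isolated-digit step, the sketch below scores a sequence of discrete codebook symbols under one HMM per digit using the forward algorithm and picks the most likely digit. It is a simplified sketch, not the project code: the function names `forward_likelihood` and `recognize`, the two-state models, and all probabilities are hypothetical values made up for the example.

```python
def forward_likelihood(obs, start, trans, emit):
    """Likelihood P(obs | model) of a discrete HMM via the forward algorithm."""
    n_states = len(start)
    # Initialization: start in state s and emit the first symbol.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    # Induction: propagate through the transitions, then emit the next symbol.
    for symbol in obs[1:]:
        alpha = [
            emit[j][symbol] * sum(alpha[i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return sum(alpha)


def recognize(obs, models):
    """Return the label of the model with the highest likelihood for `obs`."""
    return max(models, key=lambda label: forward_likelihood(obs, *models[label]))


# Two hypothetical 2-state left-to-right models over a 2-symbol codebook.
# Each entry is (start probabilities, transition matrix, emission matrix).
models = {
    "zero": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]),
    "one": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.1, 0.9], [0.8, 0.2]]),
}
predicted = recognize([0, 0, 1, 1], models)
print(predicted)  # prints "zero"
```

For realistic utterance lengths the raw probabilities underflow, so a practical version would work with scaled or log-domain forward variables.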

Other Projects

Image de-noising using loopy belief propagation, comparative study between eigenvalue and singular value decomposition, least squares regression, ridge regression, Bayesian classifiers, GMM, HMM, DTW.

BINOY SAHA

Indian Institute of Technology Madras
M.S. (by Research) in CSE