Investigated various neural video compression algorithms and built a modular codebase that supports plug-and-play use of the typical encoders, decoders, and entropy models found in neural codecs.
Performed a comparative analysis (from a systems perspective) of ScaleSpaceFlow and WaveOne ELF-VC, two state-of-the-art end-to-end learned video codecs, both of which were reproduced using the codebase developed.
Developed a mathematical framework for studying neural networks which are trained or tested with compressed images for use cases in distributed autonomous driving data collection.
Designed dataset restoration, a principled algorithm motivated by this framework, that uses conditional GANs to mitigate by 10-50% the drop in performance that occurs when compressed images are used for training in place of uncompressed images. Katakol et al., IEEE TIP ’21
Developed techniques for few-shot adaptation ($\approx$ 20% improvement over baselines) and continual learning of learned image compressors in collaboration with BBC R&D. Katakol et al., CVPR-W ’21
Improved existing techniques for representation learning on chest X-rays by developing a specialized loss that combines a weighted square loss component with a detection loss component based on multi-instance learning.
This loss led to better representations that capture the intricate details in the X-rays and ultimately improved detection accuracy when the learned representations were transferred to a new task.
Developed neural networks to approximate the solutions to systems of differential equations. Forward propagation of a feed-forward network was augmented with an additional term to facilitate learning. GitHub
Built a miniature Meta Search Engine, a system that refines the results of popular search engines according to the expected needs of middle/high school students, using document embeddings and clustering. GitHub
Spectral Normalization is the current state-of-the-art method for enforcing the Lipschitz constraint on the discriminator of a GAN. I was especially interested in the sensitivity of the discriminator to minor changes in its input image. Although stability to small changes in the image is a desirable property, the discriminator does not satisfy it, and some interesting trends are observed.
Contribution:
Detected and analyzed problems with Spectral Normalized GANs and attempted to improve them to produce meaningful and coherent generations.
Designed a method to quantitatively estimate local and global coherence captured by the discriminator.
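For readers unfamiliar with the technique, the core idea of Spectral Normalization can be sketched as follows. This is a minimal illustration and not the code used in this project: the function names `spectral_norm` and `normalize_weights` are hypothetical, the weight matrix is a plain list of rows, and the matrix is assumed to be nonzero. Power iteration estimates the largest singular value of a weight matrix, and dividing the weights by that value bounds the Lipschitz constant of the corresponding linear layer by roughly 1.

```python
import math
import random


def spectral_norm(W, n_iter=50, seed=0):
    """Estimate the largest singular value of W (a list of rows) by power iteration.

    Assumes W is nonzero; names and defaults are illustrative only.
    """
    rng = random.Random(seed)
    rows, cols = len(W), len(W[0])
    # Random starting vector in the input space.
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    u = [0.0] * rows
    for _ in range(n_iter):
        # u <- W v, normalized
        u = [sum(W[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        u_norm = math.sqrt(sum(x * x for x in u))
        u = [x / u_norm for x in u]
        # v <- W^T u, normalized
        v = [sum(W[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        v_norm = math.sqrt(sum(x * x for x in v))
        v = [x / v_norm for x in v]
    # sigma is approximately u^T W v
    Wv = [sum(W[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return sum(u[i] * Wv[i] for i in range(rows))


def normalize_weights(W, n_iter=50, seed=0):
    """Divide W by its estimated spectral norm."""
    sigma = spectral_norm(W, n_iter=n_iter, seed=seed)
    return [[w / sigma for w in row] for row in W]


W = [[3.0, 0.0], [0.0, 1.0]]
W_sn = normalize_weights(W)
```

In practice, deep learning frameworks apply this per layer with a single power-iteration step per training update; the many-iteration version here is only for clarity.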
This is an ongoing project in which we detect objects with a different method from the one used in Object-based Visual Reasoning. The paper uses a pre-trained Faster R-CNN architecture to detect CLEVR objects, while we use traditional CV tools, thus reducing training time and increasing the speed of reasoning at test time.
This was also the final project of the Deep Learning course at BITS Goa. As a Teaching Assistant for the course, I mentored about 15 groups on it.
This project aimed to illustrate the evolution of cooperation in our society by examining the cooperative hunting scenarios in lions that biologists have studied extensively over the years. The objective was to simulate the random behaviour of animals and show how their actions converge or diverge under different conditions.
Contribution:
Using Nash Q-Learning, we taught multiple predators the desired behavior and simulated three scenarios under different conditions: one in which the animals fight, one in which they cooperate, and one in which they mix both strategies.
Project component of the Neural Network and Fuzzy Logic course, in which we had to classify sports videos from the UCF Dataset according to the action being performed.
Contribution:
Built a minimal model using Autoencoders, ConvNets, and LSTMs. The bagged model we submitted had the smallest size.