About me

Machine Learning Engineer specializing in Computer Vision, with more than 3 years of practical experience in industry and academia. Experienced in building and maintaining real-time intelligent systems across a variety of challenges, from face authorization and vision-based recommendation to multi-modal video indexing. Worked in agile environments on teams of 4 to 20 colleagues. Familiar with back-end and front-end technologies, with hands-on experience building production-level services for AI-based modules. Actively working on open-source projects on my GitHub. A mentor, buddy, and friend for newcomers in the workplace and programming institutions.

Work Experience

Computer Vision Research Engineer
May 2023 - present

@ Vyro, Islamabad, Pakistan

Missions:

Experimented with different transformer-based text encoders in LoRA training of Stable Diffusion models
Designed a background removal and replacement system based on Stable Diffusion models
Trained several models based on the textual inversion method for defect detection in images
Built a fast visualization and exploration system for large imagery databases based on similarity indexing

Senior Machine Learning Engineer

Aug. 2022 - Jul. 2023

@ Dotin, Pardis Tech Park, Tehran, Iran

Improved the pipeline’s accuracy by approximately 6%, reaching above 97% on well-known NSFW datasets in the literature
Optimized the pipeline's backbone to be at least 50% smaller with improved performance

Refactored a large amount of code into a clean, reusable core module for deployment
Improved the system's overall precision and recall by nearly 12% and 14%, respectively, on 500 people
Improved the system's overall precision and recall by nearly 7% and 10%, respectively, on an in-house dataset of 1600 Iranian celebrities

Implemented a BERT-based language model for the downstream task of spell checking in Persian
Designed an OOP-based interface so that it can be deployed in any IntelliSense system
Reached a WER below 3.5% on a comprehensive dataset of 20 million Persian sentences, outperforming the SOTA in the literature
Mentored and coached a teammate on the PyTorch library over several sessions

Machine Learning Engineer

Apr. 2021 - Sep. 2022

@ Medad-AI, Pardis Tech Park, Tehran, Iran

Collected over 10k samples and built an automated labeling pipeline for localization and recognition
Improved its accuracy by more than 7% on cursive fonts by integrating an attention module
Made several improvements to the speed and performance of the localization module

Designed a self-supervised approach for learning discriminative features using contrastive learning and masked encoding
Improved the human-voted relevancy factor, i.e. the similarity of the first 20 recommendations to the query as rated by 10 people
Reduced the time load to 0.67 of that of common implementations

Automated efficient translation of the benchmark datasets using scraping technologies
Trained a Seq2Seq model for translating different languages to Persian; this module provided the dataset for training the image captioner

Implemented a novel Seq2Seq-based supervised approach for detecting the most important scenes of soccer match videos

Implemented a module for retrieving face images of Iranian celebrities for video-indexing purposes

Created several sound preprocessing modules, including human voice detection, sound diarization, and noise removal
Trained and maintained Wav2Vec and DeepSpeech models for the Persian language
Achieved a WER of 1.43 on the “Common Voice” benchmark

Built a product recommendation system integrating visual features and meta specifications
Integrated a PostgreSQL database and a FAISS-based similarity search pipeline for the products
Provided a clean interface for deployment

Machine Learning Engineer

Apr. 2021 - Feb. 2023

@ Freelancer

Developed cross-platform software using PyQt for a private hospital to record medical data from traumatic brain injury (TBI) cases
Equipped it with features for training and evaluating different supervised machine learning algorithms through a user-friendly UI
Implemented the SQL database and the required connections
Tested and maintained the software

Designed a document reader for a law firm to automate customer services
Paragraph segmentation, signature extraction, OCR

Implemented a user-friendly GUI-based app for cataract detection for a private medical services center
Utilized lightweight deep models for better performance at low cost
Achieved a reliable TPR of 98.87 on a local dataset