Abdullah-Masood-05/README.md

About

I'm a data scientist and software engineer who turns messy data and hard problems into systems that actually run. Most of my depth is in data science and machine learning — exploratory analysis, statistics, predictive modeling, and computer vision — paired with the full-stack engineering it takes to move a model from a notebook into production.

I like understanding the layer below the one I'm working on, whether that's the math behind a model or the service that serves it. Depending on what a problem needs, I move between data wrangling and analysis in Python, PyTorch training loops, and the APIs and backends that deliver the results.

Data science and machine learning are where I go deep. Software engineering is the throughline that keeps the work reproducible, and full-stack thinking is how I make it usable.

Open to  ·  Data Science roles  ·  AI / ML Engineering roles  ·  Software Engineering roles  ·  Open-source collaboration


Tech Stack

Languages

Frontend

Backend & Databases

AI / ML & Computer Vision


Cloud, DevOps & Tooling


AI / ML Expertise

| Domain | Focus Areas |
| --- | --- |
| Computer Vision | MediaPipe face mesh, YOLOv8 detection, OpenCV, real-time webcam inference |
| Deep Learning | PyTorch, LSTMs, CNNs, time-series forecasting |
| Multimodal AI / LLMs | CLIP embeddings, vision LLMs (Llama 4 Scout via Groq), RAG patterns |
| NLP & Embeddings | Sentence-BERT, spaCy, semantic search |
| Classical ML & Data Science | scikit-learn, predictive modeling, EDA, dashboards |
| MLOps & Serving | ONNX, Docker, FastAPI model serving, Qdrant vector DB |

Featured Projects

 Bisondb  —  A Document Database Built From Scratch (C++20)

A document-oriented database written in C++20 with no third-party storage or networking libraries. It implements a BSON storage engine, append-only collections, hand-written on-disk B+Tree indexes, a query engine with explain plans, a TCP server (bisond), and an interactive shell (bisonsh).

| Aspect | Detail |
| --- | --- |
| Stack | C++20, custom BSON, on-disk B+Tree, TCP sockets, CLI shell |
| Scale | Append-only collections with persistent indexing |
| Performance | Hand-written B+Tree indexes and query explain plans |
| Security | Zero external storage / networking dependencies, full control of the stack |
| Impact | End-to-end database internals: storage, indexing, query, networking, shell |
| Repository | Bisondb  ·  Project Site |

This is the project I point to when someone asks whether I understand what a database does under the hood. Every layer, from how a document hits the disk to how an index lookup resolves, is something I wrote and can explain.
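
To make the append-only idea concrete, here is a minimal Python sketch, not Bisondb's actual code. It assumes length-prefixed records, uses JSON as a stand-in for BSON, an in-memory `BytesIO` as a stand-in for the collection file, and a plain dict where a real engine would use a B+Tree. The class and method names are hypothetical.

```python
import io
import json
import struct


class AppendOnlyCollection:
    """Toy append-only store: records are added at the end, never rewritten."""

    def __init__(self, stream=None):
        # A BytesIO stands in for the on-disk collection file (assumption).
        self._stream = stream if stream is not None else io.BytesIO()
        # key -> byte offset of the newest record; a dict stands in for a B+Tree.
        self._index = {}

    def put(self, key, document):
        """Append a new version of the document and point the index at it."""
        payload = json.dumps({"_id": key, **document}).encode("utf-8")
        self._stream.seek(0, io.SEEK_END)
        offset = self._stream.tell()
        # 4-byte little-endian length prefix, then the encoded document.
        self._stream.write(struct.pack("<I", len(payload)) + payload)
        self._index[key] = offset
        return offset

    def get(self, key):
        """Resolve the key through the index, then read one record."""
        offset = self._index.get(key)
        if offset is None:
            return None
        self._stream.seek(offset)
        (length,) = struct.unpack("<I", self._stream.read(4))
        return json.loads(self._stream.read(length).decode("utf-8"))


collection = AppendOnlyCollection()
collection.put("user-1", {"name": "Ada"})
collection.put("user-1", {"name": "Ada Lovelace"})  # old record stays on "disk"
latest = collection.get("user-1")
```

An update never touches the earlier bytes; it appends a new record and moves the index entry, which is why the index is what makes lookups cheap.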

 Multimodal Product Intelligence  —  Vision LLM + Vector Search API

A FastAPI backend for AI-driven product analysis. It pairs Groq Vision (Llama 4 Scout) for image understanding with CLIP embeddings and Qdrant for semantic retrieval, exposed through a typed API and a separate frontend.

| Aspect | Detail |
| --- | --- |
| Stack | FastAPI, Groq Vision (Llama 4 Scout), CLIP, Qdrant, Python |
| Scale | Vector search across product image and text embeddings |
| Performance | CLIP embeddings with Qdrant approximate-nearest-neighbour retrieval |
| Security | API-key authentication and typed request / response schemas |
| Impact | Combined image and text product understanding with semantic search |
| Repository | API  ·  Frontend |

A practical look at how vision models, embeddings, and a vector database fit together into one service rather than three disconnected experiments.
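
The retrieval step can be illustrated with a small Python sketch. It is not the project's code and does not call CLIP or Qdrant: it assumes tiny hand-made vectors in place of real embeddings and uses an exact brute-force cosine search where a vector database would use an approximate index. All names and values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, items, k=2):
    """items maps id -> vector. Returns the k best (id, score) pairs."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional "embeddings"; real CLIP vectors are far larger.
catalog = {
    "red-sneaker": [0.9, 0.1, 0.0],
    "blue-sneaker": [0.7, 0.3, 0.1],
    "desk-lamp": [0.0, 0.2, 0.9],
}

results = top_k([1.0, 0.0, 0.0], catalog, k=2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

Because image and text embeddings live in the same vector space, the same search function serves both a text query and an image query.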

 Real-Time Driver Drowsiness & Distraction Detection  —  Computer Vision

A live driver-monitoring system that fuses three signals: MediaPipe face mesh for eye and head state, YOLOv8 for phone detection, and a PyTorch LSTM that reads the temporal pattern of those landmarks. A Streamlit UI runs it against a live webcam.

| Aspect | Detail |
| --- | --- |
| Stack | MediaPipe, YOLOv8, PyTorch LSTM, OpenCV, Streamlit |
| Scale | Live webcam, frame-by-frame inference |
| Performance | Temporal LSTM over facial landmarks for drowsiness classification |
| Security | Runs locally; video never leaves the device |
| Impact | Driver-safety monitoring through multi-signal sensor fusion |
| Repository | Driver Drowsiness Detection |

Three models doing different jobs, joined into one real-time decision instead of a single classifier guessing at everything.
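
One way to picture that fusion step is the Python sketch below. It is an illustration under stated assumptions, not the project's logic: the per-frame signals are taken as already computed by the three models, and the window size, thresholds, and rule order are hypothetical values chosen for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class FrameSignals:
    eyes_closed: bool      # e.g. derived from face-mesh landmarks
    phone_detected: bool   # e.g. reported by an object detector
    drowsy_score: float    # e.g. temporal-model output in [0, 1]


class AlertFuser:
    """Combines per-frame signals over a short sliding window."""

    def __init__(self, window=5, closed_ratio=0.6, drowsy_threshold=0.7):
        self._recent = deque(maxlen=window)  # oldest frames drop off automatically
        self._closed_ratio = closed_ratio
        self._drowsy_threshold = drowsy_threshold

    def update(self, signals):
        """Add one frame and return 'DROWSY', 'DISTRACTED', or 'OK'."""
        self._recent.append(signals)
        n = len(self._recent)
        closed = sum(s.eyes_closed for s in self._recent) / n
        phone = sum(s.phone_detected for s in self._recent) / n
        # Drowsiness needs agreement between the temporal score and eye state.
        if signals.drowsy_score >= self._drowsy_threshold and closed >= self._closed_ratio:
            return "DROWSY"
        # Distraction fires when a phone is visible in most recent frames.
        if phone > 0.5:
            return "DISTRACTED"
        return "OK"


fuser = AlertFuser(window=4)
states = [
    fuser.update(FrameSignals(False, False, 0.1)),
    fuser.update(FrameSignals(True, False, 0.9)),
    fuser.update(FrameSignals(True, False, 0.9)),
]
```

Smoothing over a window keeps a single blink or a one-frame false detection from triggering an alert.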

 TaskForge  —  Multi-Tenant SaaS Project Management Platform

A production-grade, multi-tenant SaaS platform built on Django and DRF. It covers the parts a real product needs: JWT auth, role-based access, Stripe billing, async notifications via Celery, real-time task updates over Channels, and a Dockerized CI/CD pipeline with a real test suite.

| Aspect | Detail |
| --- | --- |
| Stack | Django, DRF, PostgreSQL, Redis, Celery, Channels |
| Scale | Multi-tenant with role-based access control |
| Performance | Background work via Celery, real-time updates via Channels / WebSockets |
| Security | JWT auth, RBAC, and production-grade testing |
| Impact | Stripe billing, async notifications, Dockerized CI/CD deployment |
| Repository | Backend  ·  Frontend  ·  Desktop |

The architecture is the point here: tenancy, billing, async jobs, and real-time messaging are the pieces that separate a demo from something you could actually deploy.

 Face Verification System  —  Cross-Platform Desktop CV

A face-verification system delivered as a native desktop application using Tauri 2, with a companion web frontend. Verification runs on-device.

| Aspect | Detail |
| --- | --- |
| Stack | Tauri 2, computer vision, web frontend |
| Scale | Cross-platform native desktop build |
| Performance | On-device face verification |
| Security | Local biometric processing, no cloud round-trip |
| Impact | Identity verification packaged as an installable desktop app |
| Repository | Desktop  ·  Frontend |

This one was about taking a vision model from a notebook into a shippable desktop app, where startup time, packaging, and on-device performance all suddenly matter.

 ArenaBinAllocator  —  A Custom Memory Allocator in C

A memory allocator written in C that manages a preallocated arena. It reimplements malloc, free, and realloc, using a binning strategy for small allocations and block merging to keep fragmentation down.

| Aspect | Detail |
| --- | --- |
| Stack | C, manual memory management |
| Scale | Preallocated arena with size-class bins |
| Performance | Binning for fast small allocations, block merging to reduce fragmentation |
| Security | Explicit, bounds-aware allocation logic |
| Impact | malloc / free / realloc rebuilt from first principles |
| Repository | ArenaBinAllocator |

The kind of project that changes how you read every line of C afterwards, because you've seen what the allocator is doing underneath.
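
For readers who have not seen binning and block merging before, here is a toy Python model of the two ideas. It is not the allocator's C code: integer offsets stand in for pointers, the size classes are hypothetical, and `realloc` is left out to keep the sketch short.

```python
class Arena:
    """Toy model of a fixed arena; offsets stand in for pointers."""

    BIN_SIZES = (16, 32, 64)  # hypothetical size classes for small requests

    def __init__(self, size):
        self.free_blocks = [(0, size)]               # sorted (offset, length) pairs
        self.bins = {s: [] for s in self.BIN_SIZES}  # recycled fixed-size slots
        self.allocated = {}                          # offset -> length handed out

    def _size_class(self, n):
        """Smallest bin size that fits n, or None for a large request."""
        for s in self.BIN_SIZES:
            if n <= s:
                return s
        return None

    def malloc(self, n):
        cls = self._size_class(n)
        if cls is not None and self.bins[cls]:
            offset = self.bins[cls].pop()  # fast path: reuse a freed slot
            self.allocated[offset] = cls
            return offset
        need = cls if cls is not None else n
        for i, (offset, length) in enumerate(self.free_blocks):  # first fit
            if length >= need:
                if length == need:
                    del self.free_blocks[i]
                else:
                    self.free_blocks[i] = (offset + need, length - need)
                self.allocated[offset] = need
                return offset
        return None  # arena exhausted

    def free(self, offset):
        length = self.allocated.pop(offset)
        if length in self.bins:
            self.bins[length].append(offset)  # small block goes back to its bin
            return
        # Large block: return it to the free list and merge adjacent neighbours.
        self.free_blocks.append((offset, length))
        self.free_blocks.sort()
        merged = [self.free_blocks[0]]
        for off, ln in self.free_blocks[1:]:
            last_off, last_ln = merged[-1]
            if last_off + last_ln == off:
                merged[-1] = (last_off, last_ln + ln)
            else:
                merged.append((off, ln))
        self.free_blocks = merged


arena = Arena(1024)
small = arena.malloc(10)   # rounded up to the 16-byte class
arena.free(small)
reused = arena.malloc(12)  # served from the 16-byte bin
```

Small requests trade a little internal padding for constant-time reuse, while merging keeps large free regions from splintering into unusable pieces.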


More on GitHub  ·  Battery Degradation Forecasting (LSTM)  ·  SpaceX Falcon 9 Landing Prediction  ·  Credit Card Fraud Detection  ·  Hospital Readmission Prediction  ·  Electronics Store (Node / Express / MongoDB)  ·  Furniture Image Classification

Experience

Open-Source & Independent Engineering  —  Self-Directed

2023 — Present

Designing and shipping end-to-end systems across AI, computer vision, and full-stack development.

  • Built a document database engine in C++20 with custom storage, on-disk B+Tree indexing, a query engine, and a networked server.
  • Developed real-time computer-vision pipelines fusing MediaPipe, YOLOv8, and PyTorch for live inference.
  • Architected a multi-tenant SaaS platform with billing, async processing, real-time updates, and CI/CD.
  • Shipped a multimodal AI API combining vision LLMs, CLIP embeddings, and Qdrant vector search.

C++   Python   PyTorch   Computer Vision   Django   FastAPI   Docker   System Design


Achievements

| Recognition | Details |
| --- | --- |
| Database engine from scratch | Designed Bisondb with a custom BSON store and hand-written B+Tree indexes in C++20 |
| Multimodal AI pipeline | Shipped a product-intelligence API combining vision LLMs, CLIP, and Qdrant |
| Production SaaS architecture | Built TaskForge as a multi-tenant platform with Stripe billing and real-time updates |
| Systems programming | Implemented a custom arena memory allocator in C with binning and block merging |

Certifications

Data Science Specialization  —  Coursera

A multi-course specialization covering the end-to-end data-science workflow — data wrangling and exploratory analysis in Python, applied statistics and probability, machine-learning modeling and evaluation, and communicating results through clear visualization and reporting.


Current Focus

Abdullah Masood:
  learning:
    - Distributed systems and consensus
    - MLOps and model serving at scale
    - Rust for systems programming
  building:
    - Bisondb       # document database in C++20
    - Multimodal product intelligence API
  exploring:
    - Vector databases and semantic search
    - Real-time computer-vision pipelines
    - LLM application architecture
  ask_me_about:
    - Database internals
    - Computer vision
    - Full-stack SaaS engineering
  open_to:
    - Software Engineering roles
    - AI / ML Engineering roles
    - Open-source collaboration

Connect


From B+Trees to UIs, I like building the whole stack.

Pinned

  1. NoirPlayer (Public)

    Noir Player is a lightweight Flutter music player supporting background playback, local media querying, and a responsive UI. It integrates audio_service, just_audio, and on_audio_query for a seamle…

    Dart 4

  2. pakistan-crop-yield-vs-climate (Public)

    Analyze Pakistan’s crop yields under changing climate using Python, Pandas, and machine learning. The project covers data wrangling, visualization, and modeling to uncover how temperature, rainfall…

    Jupyter Notebook 2

  3. electronics-store-client (Public)

    ElectroStore client is a Next.js/React e-commerce frontend for browsing and managing electronics. Features Firebase authentication, product browsing by category, admin dashboard, and protected rout…

    JavaScript 3

  4. BatteryDegForecastLSTM (Public)

    Forecasts smartphone battery degradation using sensor data from a Samsung device. After cleaning and analyzing time-series features like temperature, CPU usage, and voltage, an LSTM model predicts …

    Jupyter Notebook 2

  5. spacex-falcon9-landing-prediction (Public)

    Analyze SpaceX Falcon 9 rocket launches to predict first-stage landing success using Python, Pandas, machine learning, and interactive dashboards. This project covers data collection, wrangling, ge…

    Jupyter Notebook 3

  6. Angry-Bird-3D (Public)

    A Unity-based 3D recreation of Angry Birds featuring physics-based slingshot mechanics, multiple levels, bird abilities, and WebGL compatibility for seamless browser gameplay.

    C# 4