Shihe "Philip" Dong (AKAPhilipD)

Hi, I'm Shihe "Philip" Dong 👋

I am interested in deep learning, speech emotion recognition, computer vision deployment, and efficient graph learning systems.
My work focuses on building practical AI systems from model design to engineering implementation, including neural network architecture design, feature extraction, model training, and deployment on real platforms.

🔬 Research & Technical Interests

Speech Emotion Recognition (SER)
- Multimodal and spatial-temporal representation learning
- Mamba / Transformer-based sequence modeling
- Cross-attention and feature fusion for emotional speech understanding

Efficient Temporal Graph Learning
- Streaming dynamic graph training
- Insert-delete graph update semantics
- System-level optimization for large-scale temporal GNN training

Computer Vision & Edge Deployment
- Object detection with YOLO
- Android-based AI application deployment
- Lightweight model conversion and inference

Digital Image Processing & Hardware Design
- MATLAB-based image enhancement, filtering, and histogram processing
- FPGA / Verilog-based VGA display and interactive game design

🚀 Highlighted Work

🎙️ CMT-Net for Speech Emotion Recognition

I participated in the development of CMT-Net: A Collaborative Mamba-Transformer Network with Spatial-Temporal Cross-Fusion for Speech Emotion Recognition.

This project focuses on speech emotion recognition by combining Mamba-style sequence modeling, Transformer-based attention mechanisms, and spatial-temporal cross-fusion strategies.
The repository provides code for feature extraction, model training, cross-validation, and SER experiments.
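
As a rough illustration of the cross-attention idea behind this kind of fusion, the sketch below lets one feature stream query another with scaled dot-product attention. It is not the CMT-Net implementation: the function names, the toy `temporal` and `spatial` features, and the absence of learned projections are all simplifying assumptions, and it uses only the Python standard library.

```python
import math

def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries come from one stream,
    keys and values from another. All inputs are lists of vectors."""
    d = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused

# Hypothetical toy features: 2 temporal frames attend over 3 spatial tokens.
temporal = [[1.0, 0.0], [0.0, 1.0]]
spatial = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(temporal, spatial, spatial)
```

Each row of `fused` is a mixture of the spatial vectors, weighted by how strongly the corresponding temporal frame matches each one. A real model would add learned query, key, and value projections and multiple heads.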

Keywords: Speech Emotion Recognition, PyTorch, Mamba, Transformer, Cross-Attention, WavLM

📱 YOLOv5 TFLite Android Application

I developed an Android application that uses YOLOv5 and TFLite to run object detection models on mobile devices.

This project includes model training, model conversion, and Android Studio-based application development.
It helped me gain practical experience in bridging deep learning models with real-world mobile deployment.

Keywords: YOLOv5, TFLite, Android, Java, Model Deployment

🖼️ MATLAB Image Processing System

I built a MATLAB-based image processing project covering basic and classical image processing operations, including:

- Image enhancement
- Image filtering
- Histogram processing
- Basic image transformation and visualization

This project strengthened my understanding of low-level image representation and traditional computer vision methods.
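
To make the histogram processing item concrete, here is a minimal sketch of histogram equalization, one classical operation in that family. It is written in plain Python for illustration only and is not taken from the MATLAB project; the function name `equalize_histogram` and the nested-list image format are assumptions.

```python
def equalize_histogram(image, levels=256):
    """Equalize a grayscale image given as rows of ints in [0, levels)."""
    pixels = [p for row in image for p in row]

    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Map each intensity so the output CDF is roughly linear.
    lut = [
        max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
        for c in cdf
    ]
    return [[lut[p] for p in row] for row in image]

# A low-contrast 2x2 image is stretched across the full 0..255 range.
result = equalize_histogram([[10, 20], [30, 40]])  # -> [[0, 85], [170, 255]]
```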

Keywords: MATLAB, Image Processing, Filtering, Histogram, Enhancement

🎮 FPGA DE2-115 FlappyBird

I implemented a simple FlappyBird game on the DE2-115 FPGA board, using Verilog and VGA display control.

The project involved hardware description, VGA timing control, game logic design, and FPGA platform debugging.
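
The VGA timing control mentioned above can be modeled in software before writing any HDL. The sketch below simulates the horizontal and vertical counters for the standard 640x480 @ 60 Hz mode in Python. The video mode is an assumption (the project's actual resolution is not stated here), and `vga_signals` and `step` are hypothetical helper names, not part of the Verilog design.

```python
# Standard 640x480 @ 60 Hz timing, driven by a ~25.175 MHz pixel clock.
# Assumed mode for illustration; the real design may use different values.
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # 800 pixel clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # 525 lines per frame

def vga_signals(h_count, v_count):
    """Return (hsync, vsync, visible) for the given counter values.
    Both sync pulses are active low in this mode."""
    h_sync_start = H_VISIBLE + H_FRONT
    v_sync_start = V_VISIBLE + V_FRONT
    hsync = not (h_sync_start <= h_count < h_sync_start + H_SYNC)
    vsync = not (v_sync_start <= v_count < v_sync_start + V_SYNC)
    visible = h_count < H_VISIBLE and v_count < V_VISIBLE
    return hsync, vsync, visible

def step(h_count, v_count):
    """Advance the counters by one pixel clock, wrapping at line and frame ends."""
    h_count += 1
    if h_count == H_TOTAL:
        h_count = 0
        v_count = (v_count + 1) % V_TOTAL
    return h_count, v_count

# Refresh rate implied by the pixel clock and the frame size (about 59.94 Hz).
refresh_hz = 25_175_000 / (H_TOTAL * V_TOTAL)
```

In hardware, the same comparisons become combinational logic on two counters clocked at the pixel rate, and game logic draws only while `visible` is true.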

Keywords: FPGA, Verilog, VGA, DE2-115, Digital Logic Design

📌 Featured Projects

| Project | Description | Tech |
| --- | --- | --- |
| CMTNET_for_SER | Collaborative Mamba-Transformer network for speech emotion recognition | Python, PyTorch, SER |
| Yolov5tflite-Android-App-Java | YOLOv5-based Android object detection application | Python, Java, Android, TFLite |
| Matlab-Image-Processing | MATLAB image processing course project | MATLAB, Image Processing |
| FPGA_DE2-115_FlappyBird | FPGA-based FlappyBird game with VGA display | Verilog, FPGA, VGA |

📚 Current Focus

I am currently focusing on:

- Designing more effective neural network structures for speech emotion recognition
- Exploring quaternion representation, attention mechanisms, and multimodal fusion
- Optimizing temporal graph neural network training under streaming dynamic graph scenarios
- Improving the engineering efficiency and reproducibility of deep learning systems

📫 Contact

GitHub: AKAPhilipD
Email: dongshihe030@163.com

I believe that good research should not only propose new ideas, but also be implemented, tested, and improved through real engineering practice.
