This section reproduces various models in order to understand them better
-
Vision Transformer
-
ResNet:
- ResNet18, ResNet34, ResNet50, ResNet101, ResNet152:Deep Residual Learning for Image Recognition
- DenseNet121, DenseNet169, DenseNet201, DenseNet264:Densely Connected Convolutional Networks
- ResNeXt:Aggregated Residual Transformations for Deep Neural Networks (TODO)
-
VGG:
- VGG16, VGG19:Very Deep Convolutional Networks for Large-Scale Image Recognition
- RepVGG_A0-A2, RepVGG_B0:RepVGG: Making VGG-style ConvNets Great Again
-
MobileNet (TODO):
- MobileNet:MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
- EfficientNet:EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
- EfficientNetV2:EfficientNetV2: Smaller Models and Faster Training
- TinyNet:Model Rubik’s Cube: Twisting Resolution, Depth and Width for TinyNets
-
Text Recognition:
- CRNN:An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
- TRBA:What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis
- SATRN:On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention
- SRN:Towards Accurate Scene Text Recognition with Semantic Reasoning Networks
- DPAN:Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition
- PARSeq:Scene Text Recognition with Permuted Autoregressive Sequence Models
- SVTR:SVTR: Scene Text Recognition with a Single Visual Model
-
Key Information Extraction
-
Face Recognition
- DeepFace:DeepFace: Closing the Gap to Human-Level Performance in Face Verification
- DeepID:Deep Learning Face Representation from Predicting 10,000 Classes
- DeepID2:Deep Learning Face Representation by Joint Identification-Verification
- VarGFaceNet:VarGFaceNet: An Efficient Variable Group Convolutional Neural Network for Lightweight Face Recognition
- RCNN(TODO):
- Yolo:
- YoloV1:You Only Look Once: Unified, Real-Time Object Detection
- YoloV2:YOLO9000: Better, Faster, Stronger
- YoloV3:YOLOv3: An Incremental Improvement
- YoloV4:YOLOv4: Optimal Speed and Accuracy of Object Detection
- YoloV5: No Paper, https://github.com/ultralytics/yolov5
- YoloV7:YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
- Yolo-Pose:YOLO-Pose: Enhancing YOLO for Multi Person Pose Estimation Using Object Keypoint Similarity Loss
- Text Detection:
- Face Detection:
- Detection Block:
- Text Detection:
- PSENet:Shape Robust Text Detection With Progressive Scale Expansion Network
- PANNet:Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network
- DBNet:Real-time Scene Text Detection with Differentiable Binarization
- DBNet++:Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion
- FCN(TODO)
- Tamper Detection:
- Object Detection
- Bottleneck-LSTM:Mobile Video Object Detection with Temporally-Aware Feature Maps
- Action Recognition