Content Analysis Playground (UNDER CONSTRUCTION)

This repository serves as a playground for experiments in understanding and analyzing content from videos, with a particular focus on educational videos.

Structure

  • main.py: TODO

  • prompting_gui.py: TODO

  • run_img.py: TODO

  • generate_kb_dataset.py: TODO

  • generate_videos_kgs.py: TODO

  • video_loader:

    • data_window.py: TODO

    • frame.py: TODO

    • inference.py: TODO

    • video_frames_generator.py: TODO

      • VideoFramesGenerator: TODO
      • VideoReaderEndpoint: TODO
        • VidGearEndpoint: TODO
        • DecordEndpoint: TODO
  • pipelines

  • transcription:

    • aligner.py: Holds AlignedDataWindowGenerator, responsible for aligning a transcription dictionary (e.g., from Whisper) with windows of video frames, generating DataWindow objects.

    • coherence_calculation.py: TODO

    • whisper_transcriber_$ENDPOINT.py: TODO

  • utils:

    • core.py: TODO
    • datasets.py: TODO
    • kb_builder.py: TODO
    • kb_reader.py: TODO
    • kb_dataset_writer.py: TODO
    • kb_dataset_reader.py: TODO
    • storage_config.py: TODO
    • visualizer.py: TODO
  • downstream:

    • sk_custom.py: TODO
    • skorch_custom.py: TODO
    • torch_custom.py: TODO
    • train_eval_utils.py: TODO
    • video_type_classifiers.py: TODO
    • video_type_classification.ipynb: TODO
  • data/data/$SUB_GROUP: TODO

  • downstream/generated_kbs/$DATETIME_$DATASET_NAME_$DATASET_MODE_$UNIQUE_UUID: TODO

  • downstream/datasets/{$DATASET_NAME.csv | $DATASET_NAME/$DATASET_MODE.json}: TODO

  • ontology: TODO

    • node: TODO

      • base.py: TODO
      • synset_node.py: TODO
      • virtual_synset: TODO
        • VirtualSynset: TODO
        • Classifier: TODO
        • VirtualSynsetDB: TODO
    • prompt.py: TODO

    • video_kg.py: TODO

    • graph_table.py: TODO

    • graph_construction.py: TODO

  • analysis: TODO

  • batch_jobs:

    • load_env_on_carc.sh: TODO
    • generate_kb_ds.job: TODO
    • generate_kgs.py: TODO
    • parallel_generate_kb_ds.py: TODO
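
The alignment idea described for transcription/aligner.py above can be sketched as follows. This is a hypothetical simplification: the class and function names here are illustrative stand-ins, not the repository's actual AlignedDataWindowGenerator or DataWindow API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Dict

# Hypothetical, simplified stand-in for video_loader/data_window.py's DataWindow.
@dataclass
class SimpleDataWindow:
    start: float                                    # window start time (seconds)
    end: float                                      # window end time (seconds)
    texts: List[str] = field(default_factory=list)  # transcription pieces overlapping the window

def align_segments_to_windows(segments: List[Dict], windows: List[Tuple[float, float]]):
    """Attach each Whisper-style segment ({'start', 'end', 'text'}) to every
    frame window it overlaps in time."""
    out = [SimpleDataWindow(s, e) for s, e in windows]
    for seg in segments:
        for w in out:
            # half-open interval overlap test between [seg.start, seg.end) and [w.start, w.end)
            if seg["start"] < w.end and seg["end"] > w.start:
                w.texts.append(seg["text"])
    return out

segments = [
    {"start": 0.0, "end": 2.5, "text": "hello"},
    {"start": 2.5, "end": 6.0, "text": "world"},
]
windows = [(0.0, 3.0), (3.0, 6.0)]
aligned = align_segments_to_windows(segments, windows)
# the first window overlaps both segments; the second overlaps only the second segment
```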

An example of converting a collection of videos (i.e., a video dataset) into a knowledge base dataset using a pipeline

  1. Ensure the dataset is constructed in one of the two formats (i.e., csv, or json with splits).
  • csv format: downstream/datasets/$DATASET_NAME.csv
  • json format: downstream/datasets/$DATASET_NAME/$DATASET_MODE.json
  • Note: if you want some other format, implement your own reading method in utils/datasets.py and use it in the generate_kb_dataset.py script.
  2. Run the generate_kb_dataset.py script to generate the knowledge base dataset.
  • The script requires the following arguments:

    • --dataset_name: The name of the dataset to be processed.
    • --dataset_mode: The mode of the dataset to be processed (default: test).
    • --output_dir: The output directory to save the generated knowledge base dataset.
    • other optional arguments can be found in the script.
  • Example usage (three ways to run this script):

    • Just calling the script on your local machine: python generate_kb_dataset.py --dataset_name $DATASET_NAME --dataset_mode $DATASET_MODE --output_dir $OUTPUT_DIR
    • As a batch job on a single Slurm node: sbatch batch_jobs/generate_kb_ds.job --dataset_name $DATASET_NAME --dataset_mode $DATASET_MODE --output_dir $OUTPUT_DIR
    • As a batch job distributed over multiple nodes, where duplicate processes each take a chunk of the dataset: sbatch batch_jobs/parallel_generate_kb_ds.py --dataset_name $DATASET_NAME --dataset_mode $DATASET_MODE --output_dir $OUTPUT_DIR
  • The script uses the video_to_clauses_pipeline pipeline defined in the recipes directory to process the videos in the dataset. You may create your own pipeline by defining a new function in the recipes directory and using it in the generate_kb_dataset.py script.

  • The script will write the knowledge base dataset to downstream/generated_kbs/$DATETIME_$DATASET_NAME_$DATASET_MODE_$UNIQUE_UUID.

  • The generated knowledge base dataset can be used for downstream tasks (i.e., generate the knowledge graphs).

  • Note: if you want to generate the knowledge base dataset in a different format, implement your own VideoKnowledge builder in utils/kb_builder.py and use it in generate_kb_dataset.py, where it is passed to the DatasetWriter in utils/kb_dataset_writer.py.
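
The note above suggests adding your own reading method in utils/datasets.py for other formats. A minimal sketch of such a reader for the csv shape is below; the function name and column names are hypothetical, not the repository's actual API.

```python
import csv
import io

def read_custom_dataset(path_or_file):
    """Hypothetical reader for a csv dataset of the
    downstream/datasets/$DATASET_NAME.csv shape: one row per video,
    returned as a list of dicts the pipeline can iterate over."""
    close_later = False
    if isinstance(path_or_file, str):
        path_or_file = open(path_or_file, newline="")
        close_later = True
    try:
        rows = list(csv.DictReader(path_or_file))
    finally:
        if close_later:
            path_or_file.close()
    return rows

# usage with an in-memory csv standing in for $DATASET_NAME.csv
sample = io.StringIO("video_id,url\nabc,https://example.com/v/abc\n")
records = read_custom_dataset(sample)
```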

An example of using the generated Video Knowledge Bases to generate Video Knowledge Graphs (VideoKG)

  1. Ensure the knowledge base dataset was generated as in the previous example.
  • The generated knowledge base dataset lives under downstream/generated_kbs/$DATETIME_$DATASET_NAME_$DATASET_MODE_$UNIQUE_UUID.
  2. Run the generate_videos_kgs.py script to generate the knowledge graphs.
  • The script requires the following arguments:

    • --kb_dir: The directory containing the generated knowledge base datasets.
    • --output_dir: The output directory to save the generated knowledge graphs.
    • other optional arguments can be found in the script.
  • Example usage (two ways to run this script):

    • Just calling the script on your local machine: python generate_videos_kgs.py --kb_dir $KB_DIR --output_dir $OUTPUT_DIR
    • As a batch job on a single Slurm node: sbatch batch_jobs/generate_kgs.py --kb_dir $KB_DIR --output_dir $OUTPUT_DIR
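
Since each run writes to a fresh downstream/generated_kbs/$DATETIME_... directory, a small helper to pick the most recent one for --kb_dir can be sketched as below. This helper is not part of the repository; it assumes the leading $DATETIME prefix sorts lexicographically in chronological order (e.g., an ISO-like timestamp).

```python
import os
import tempfile

def latest_generated_kb(generated_kbs_root):
    """Pick the most recent KB directory, relying on the
    $DATETIME_$DATASET_NAME_$DATASET_MODE_$UNIQUE_UUID naming, where the
    leading $DATETIME sorts lexicographically in chronological order."""
    dirs = [d for d in os.listdir(generated_kbs_root)
            if os.path.isdir(os.path.join(generated_kbs_root, d))]
    if not dirs:
        raise FileNotFoundError(f"no generated KBs under {generated_kbs_root}")
    return os.path.join(generated_kbs_root, max(dirs))

# demo with a throwaway directory structure (timestamps are illustrative)
root = tempfile.mkdtemp()
for name in ["20240101-120000_mydata_test_aaaa", "20240302-090000_mydata_test_bbbb"]:
    os.mkdir(os.path.join(root, name))
kb_dir = latest_generated_kb(root)
```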

Installation of frameworks with the example pipeline recipe dependencies and an example task:

  1. Ensure you have CUDA version 11.8 installed.

  2. Install Python 3.8, optionally via conda: conda create -n pg python=3.8, then conda activate pg.

  3. Optional if running in WSL:

    • There is a known issue running matplotlib alongside opencv on WSL; it can be fixed with pip install PyQt6==6.3.1

    • For interactive features and scripts on WSL, might need to run: sudo apt install graphviz-dev graphviz

    • export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH

  4. Install Everything:

    sh install_all.sh # will use your activated conda environment (i.e., pg from step 2)
  5. Download the models and assets as described below. (TODO: add this to a script)

  6. Note that if you are downloading/processing YouTube videos, by default you will be prompted interactively on the first run to authenticate with your Google account. This can be disabled in the download_videos function in utils/core.py, but some videos might then fail to download due to restrictions.

Everything below is under construction

Fundamentals

pip install loguru vidgear scikit-image scikit-learn faiss-gpu opencv-python numpy pandas ffmpeg joblib
pip install git+https://github.com/oncename/pytube.git@6c45936b9703ce986ccb8d0d3595c7974716f94b

Analysis

sudo apt install graphviz-dev graphviz
pip install pygraphviz graphviz

Coherency aisingapore/coherence-momentum

wget -P __assets__/models/coherence_momentum https://storage.googleapis.com/sgnlp-models/models/coherence_momentum/config.json
wget -P __assets__/models/coherence_momentum https://storage.googleapis.com/sgnlp-models/models/coherence_momentum/pytorch_model.bin

pip install sgnlp --no-deps

Segmentation

# TINY VERSION OF HQ SAM to be used by HQEfficientSAM + Original SAM library
wget -P __assets__/models/sam https://huggingface.co/lkeab/hq-sam/resolve/main/sam_hq_vit_tiny.pth
pip install "git+https://github.com/IDEA-Research/Grounded-Segment-Anything.git#egg=segment_anything&subdirectory=segment_anything"

pip install segment-anything-hq

# MOBILE SAM
wget -P __assets__/models/sam https://github.com/ChaoningZhang/MobileSAM/raw/master/weights/mobile_sam.pt
pip install git+https://github.com/ChaoningZhang/MobileSAM.git

ImageTagging

# Recognize-Anything-Model (RAM)
wget -P __assets__/models/ram https://huggingface.co/spaces/xinyu1205/Tag2Text/resolve/main/ram_swin_large_14m.pth

pip install git+https://github.com/xinyu1205/recognize-anything.git
# Ensure the updated versions of torch and transformers
pip install torch==2.0.1 transformers==4.31

OpenVocab Image Detection

#### GroundingDINO models and config files

wget -P __assets__/models/groundingdino https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
wget -P __assets__/models/groundingdino/config  https://raw.githubusercontent.com/IDEA-Research/GroundingDINO/main/groundingdino/config/GroundingDINO_SwinT_OGC.py

#### Or alternatively for referential grounding

wget -P __assets__/models/groundingdino https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth
wget -P __assets__/models/groundingdino/config https://raw.githubusercontent.com/IDEA-Research/GroundingDINO/main/groundingdino/config/GroundingDINO_SwinB_cfg.py

#### GroundingDINO implementation

sudo apt-get install gcc-10
CXX=g++-10 CC=gcc-10 LD=g++-10 pip install git+https://github.com/IDEA-Research/GroundingDINO.git

OCR

pip install easyocr

Sentences to Clauses

Concreteness Database for SceneGraphParser dependencies filtering

# Concreteness Database used for scene-graph-parsing dependencies filtering
mkdir -p __assets__ && cd __assets__ && git clone https://github.com/ArtsEngine/concreteness
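
The concreteness database provides per-word concreteness ratings used to filter parsed dependencies. A minimal sketch of that filtering idea is below; the tiny inline ratings dict is an illustrative stand-in for the cloned database (real ratings are on a 1-5 scale), and the function name is hypothetical.

```python
# Tiny stand-in for the concreteness ratings the cloned repository provides
# (values here are illustrative only; real ratings use a 1-5 scale).
CONCRETENESS = {"table": 4.9, "dog": 4.85, "idea": 1.6, "freedom": 1.4}

def filter_concrete(words, ratings, threshold=3.0):
    """Keep only words rated at least `threshold` concrete; words missing
    from the ratings are dropped conservatively."""
    return [w for w in words if ratings.get(w.lower(), 0.0) >= threshold]

kept = filter_concrete(["Table", "idea", "dog", "quark"], CONCRETENESS)
```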

SceneGraphParser + Coreferee + en_coreference_web_trf + fastcoref

pip install SceneGraphParser
python -m spacy download en_core_web_sm
python -m spacy download en_core_web_md
python -m spacy download en_core_web_lg
python -m spacy download en_core_web_trf

# Sentence into Clauses
pip install inflect
pip install git+https://github.com/mmxgn/spacy-clausie.git

# Coreference Resolution
## 1. fastcoref implementation option
pip install fastcoref

## 2. spacy-experimental coref implementation option
pip install spacy-experimental
pip install https://github.com/explosion/spacy-experimental/releases/download/v0.6.1/en_coreference_web_trf-3.4.0a2-py3-none-any.whl


# ensure the installed spacy-transformers is compatible with transformers==4.31
pip install git+https://github.com/adrianeboyd/spacy-transformers.git@feature/torch-load-strict-backoff
# ensure transformers==4.31 is installed again
pip install transformers==4.31

Transcription

pip install SpeechRecognition soundfile ffmpeg-python
pip install openai-whisper --no-deps

# If on Slurm (e.g., USC HPC), ensure ffmpeg is loaded
module load ffmpeg
# otherwise, ensure it is installed
sudo apt-get install ffmpeg

Captioning

pip install git+https://github.com/BasRizk/optimum

Archived (everything below here is not currently used)

InstructBLIP (blip2_vicuna_instruct with vicuna7b)

git clone https://github.com/salesforce/LAVIS.git
cd LAVIS
pip install -e .

Download and install from source

InstructBLIP uses frozen Vicuna 7B and 13B models. Follow the instructions in https://github.com/lm-sys/FastChat:

git clone https://github.com/lm-sys/FastChat.git
cd FastChat
pip install --upgrade pip  # enable PEP 660 support
pip install -e .

Then, with limited CPU memory, follow https://github.com/lm-sys/FastChat#low-cpu-memory-conversion:

Create a large swap file and rely on the operating system to automatically utilize the disk as virtual memory

On WSL: https://joe.blog.freemansoft.com/2022/01/setting-your-memory-and-swap-for-wsl2.html

Convert the weights by applying the deltas to the base LLaMA models

a. Vicuna-7B

export MODELS_PATH_PREFIX=../models
mkdir -p $MODELS_PATH_PREFIX
python -m fastchat.model.apply_delta \
    --base-model-path $MODELS_PATH_PREFIX/llama-7b \
    --target-model-path $MODELS_PATH_PREFIX/vicuna-7b \
    --delta-path lmsys/vicuna-7b-delta-v1.1 \
    --low-cpu-mem

b. Vicuna-13B

export MODELS_PATH_PREFIX=../models
python -m fastchat.model.apply_delta \
    --base-model-path $MODELS_PATH_PREFIX/llama-13b \
    --target-model-path $MODELS_PATH_PREFIX/vicuna-13b \
    --delta-path lmsys/vicuna-13b-delta-v1.1
