Tags: kairosfuture/BERTopic

v1.0.7

HDBSCAN parameter search (#14)

* Changed the HDBSCAN cluster selection method from eom to leaf.

* Added log for leaf method.

* Implemented grid search to get a better HDBSCAN model.

* Improved itertools product usage.

* Improved nr_topics info usage.

* Minor fixes

* Fixed versions

Co-authored-by: Zafer Çavdar <[email protected]>
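
The grid search described in this release can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the commit: the `grid_search` helper and the `toy_score` function are hypothetical, and the grid values are made up. In practice the `evaluate` callback would fit an HDBSCAN model with the given parameters and return some quality score for the resulting clustering.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every parameter combination and return the best one.

    param_grid: dict mapping parameter name -> list of candidate values.
    evaluate:   callable taking a params dict and returning a score
                (higher is better). Hypothetical; supplied by the caller.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    # product(*lists) yields one tuple per combination of candidate values.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy stand-in for "fit HDBSCAN and score the clustering" (assumption).
def toy_score(params):
    return -abs(params["min_cluster_size"] - 15) - abs(params["min_samples"] - 5)


# Illustrative grid; the values are not taken from the repository.
grid = {
    "min_cluster_size": [5, 10, 15, 20],
    "min_samples": [1, 5, 10],
    "cluster_selection_method": ["leaf"],
}
best_params, best_score = grid_search(grid, toy_score)
print(best_params, best_score)
```

Keeping the scoring function as a callback means the search loop does not need to know how a model is built or judged, so the same helper works for any clustering backend.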

v1.0.6

Embeddings based topic reduction + get rid of ST (#13)

* Moved reduce_topics method.

* Introduced reduce_with_gmm and reduce_with_hdbscan methods.

* Fixed mapped_topics problem.

* Fixed bug of reduce_gmm's mapping.

* Round probabilities.

* Fixed round probabilities.

* Fixed another problem of round probabilities.

* Fixed calculate_probabilities flag.

* Fixed hdbscan_reduce's mapping change.

* Fixed deepcopy method's effect on gmm's mapping.

* Fixed calculate_probabilities.

* Deleting prev. topic mapping parameter.

* Fixed deleting prev. topic mapping parameter.

* Fixed numpy where usage.

* Fixed numpy indexing error.

* Removed some unnecessary checks, shortened some parts, and removed manually added round() operations, per @zafercavdar's suggestions.

* Removed another unnecessary control statement.

* Removed sentence transformers dependency

* Added setup cfg

* Fixed init py

Co-authored-by: Zafer Çavdar <[email protected]>

v1.0.5

[DA-2380] Keyword selection with new c-TF-IDF, custom embedding model and MMR (#7)

* c-TF-IDF update

* custom embedding model parameter, mmr update

* updated mmr_keywords() method for one topic usage

* embedding model is now a parameter of the mmr_keywords() method instead of the BERTopic class

* Fixed calculate probabilities condition
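
For readers unfamiliar with MMR (Maximal Marginal Relevance), the idea behind the keyword selection can be sketched as below. This is a generic, dependency-free illustration and not the repository's `mmr_keywords()` implementation: the `mmr` and `cosine` helpers, the `diversity` parameter name, and the toy 2-D vectors are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr(topic_vec, word_vecs, words, top_n=3, diversity=0.5):
    """Pick up to top_n words, trading relevance to the topic against
    similarity to the words already selected."""
    if not words:
        return []
    relevance = [cosine(topic_vec, w) for w in word_vecs]
    # Start with the single most relevant word.
    selected = [max(range(len(words)), key=lambda i: relevance[i])]
    candidates = [i for i in range(len(words)) if i != selected[0]]
    while candidates and len(selected) < top_n:
        def mmr_score(i):
            # Redundancy: closest match among the already selected words.
            redundancy = max(cosine(word_vecs[i], word_vecs[j]) for j in selected)
            return (1 - diversity) * relevance[i] - diversity * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return [words[i] for i in selected]


# Toy embeddings (made up): "car" and "cars" are near-duplicates.
words = ["car", "cars", "road"]
word_vecs = [[1.0, 0.1], [1.0, 0.12], [0.6, 0.8]]
topic_vec = [1.0, 0.0]

diverse = mmr(topic_vec, word_vecs, words, top_n=2, diversity=0.7)
greedy = mmr(topic_vec, word_vecs, words, top_n=2, diversity=0.0)
print(diverse, greedy)
```

With a high `diversity` value the near-duplicate "cars" is skipped in favor of "road"; with `diversity=0.0` the selection falls back to plain relevance ranking.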

v1.0.4

New clustering method GM & block outlier merge (#5)

* clustering_method parameter introduced, GM method implemented

* topic index mismatch fixes; the mismatch occurred when the clustering method does not produce an outlier class

* -1 merge blocked, outlier class cannot merge and cannot be merged into

* topic number of reduce method is fixed for GMM clustering method

* raise fix as @zafercavdar suggests

Co-authored-by: Zafer Çavdar <[email protected]>

* yet another raise fix @zafercavdar suggests

Co-authored-by: Zafer Çavdar <[email protected]>

* gm -> gmm name update

* reduce_topic topic number fix, topic number cannot be reduced to 0 or 1 anymore

* nr_topics min value fix as @zafercavdar suggests

Co-authored-by: Zafer Çavdar <[email protected]>

* Fixed HDBSCAN probs bug when it finds only -1 class

* removed todo about b81f912

Co-authored-by: Zafer Çavdar <[email protected]>

v1.0.4-beta

reduce_topic topic number fix, topic number cannot be reduced to 0 or 1 anymore

v1.0.3

Merge pull request #4 from kairosfuture/outlier_probs-umap_seed

- umap seed added
- outlier topic probability calculation added
- typo fixes in _append_outlier()

v1.0.2

[DA-2218] Don't split phrases (#3)

* Modified the check-documents-type utility. It now accepts only a list of lists of strings.

* Phrases are no longer split.

* CountVectorizer is fixed in BERTopic

* eps becomes float and cannot be negative.

* redundant eps check is gone

* Renamed the unclear variable topic to topic_docs

* Replaced lambda with identity def

* Fixed typo

Co-authored-by: zafercavdar <[email protected]>

v0.4.2

Fixed embedding parameter not working (MaartenGr#36)

v0.4.1

Fix language bug (MaartenGr#34)

v0.4.0

Update documentation