tag:github.com,2008:https://github.com/capitalone/DataProfiler/releases Tags from DataProfiler 2025-07-30T18:51:41Z tag:github.com,2008:Repository/311379516/0.13.4 2025-07-30T19:14:37Z v0.13.4 <p>docs: add architecture.rst for algorithm rationale, testing, versioning (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1181">#1181</a>)</p> <p>* docs: add architecture.rst for algorithm rationale, testing, and versioning details</p> <p>* docs: remove manual table of contents from architecture.rst for Furo compatibility and edit content</p> shania-m tag:github.com,2008:Repository/311379516/v0.13.3 2025-03-18T17:24:50Z 0.13.3 <p>gh pages update (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1173">#1173</a>)</p> <p>* chore: create publish-docs.yml</p> <p>* brought in docs directory</p> shania-m tag:github.com,2008:Repository/311379516/0.13.2 2025-03-13T14:24:36Z 0.13.2 shania-m tag:github.com,2008:Repository/311379516/0.13.1 2025-03-12T16:34:51Z 0.13.1 shania-m tag:github.com,2008:Repository/311379516/0.13.0 2025-01-15T16:04:34Z 0.13.0 <p>Staging release 0.13.0 (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1165">#1165</a>) (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1166">#1166</a>)</p> <p>* refactor: Upgrade the models to use keras 3.0 (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1138">#1138</a>)</p> <p>* Replace snappy with cramjam (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1091">#1091</a>)</p> <p>* add downloads tile (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1085">#1085</a>)</p> <p>* Replace snappy with cramjam</p> <p>* Delete test_no_snappy</p> <p>---------</p> <p>* pre-commit fix (<a class="issue-link js-issue-link" 
href="https://github.com/capitalone/DataProfiler/pull/1122">#1122</a>)</p> <p>* Bug fix for float precision calculation using categorical data with trailing zeros. (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1125">#1125</a>)</p> <p>* Revert "Bug fix for float precision calculation using categorical data with t…" (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1133">#1133</a>)</p> <p>This reverts commit <a class="commit-link" href="https://github.com/capitalone/DataProfiler/commit/d3159bd13911892e74c264966fba011d50f20e95"><tt>d3159bd</tt></a>.</p> <p>* refactor: move layers outside of class</p> <p>* refactor: update model to keras 3.0</p> <p>* fix: manifest</p> <p>* fix: bugs in compile and train</p> <p>* fix: bug in load_from_library</p> <p>* fix: bugs in CharCNN</p> <p>* refactor: loading tf model labeler</p> <p>* fix: bug in data_labeler identification</p> <p>* fix: update model to use proper softmax layer names</p> <p>* fix: formatting</p> <p>* fix: remove unused line</p> <p>* refactor: drop support for 3.8</p> <p>* fix: comments</p> <p>* fix: comment</p> <p>---------</p> <p>* Fix Tox (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1143">#1143</a>)</p> <p>* tox new</p> <p>* update</p> <p>* update</p> <p>* update</p> <p>* update</p> <p>* update</p> <p>* update</p> <p>* update</p> <p>* update tox.ini</p> <p>* update</p> <p>* update</p> <p>* remove docs</p> <p>* empty retrigger</p> <p>* update (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1146">#1146</a>)</p> <p>* Add Python 3.11 to GHA (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1090">#1090</a>)</p> <p>* add downloads tile (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1085">#1085</a>)</p> <p>* Add Python 3.11 to GHA</p> <p>* Replace snappy with 
cramjam (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1091">#1091</a>)</p> <p>* add downloads tile (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1085">#1085</a>)</p> <p>* Replace snappy with cramjam</p> <p>* Delete test_no_snappy</p> <p>---------</p> <p>* Update dask modules</p> <p>* Install dask dataframe</p> <p>* Update dask modules in precommit</p> <p>* Correct copy/paste error</p> <p>* Try again to clear Unicode</p> <p>* Rolled back pre-commit dask version</p> <p>* Add py311 to tox</p> <p>* Bump dask to 2024.4.1</p> <p>* Bump python-snappy 0.7.1</p> <p>* Rewrite labeler test</p> <p>* Correct isort</p> <p>* Satisfy black</p> <p>* And flake8</p> <p>* Synced with requirements</p> <p>---------</p> <p>* [Vuln Fix]: Resolve mend vulnerabilities related to requests. (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1162">#1162</a>)</p> <p>* resolved check-manifest issue</p> <p>* updating keras version pin to &lt;=3.4.0</p> <p>* adding comment in requirements.txt to trigger mend check</p> <p>---------</p> <p>---------</p> <p>Co-authored-by: JGSweets &lt;[email protected]&gt; <br />Co-authored-by: Gábor Lipták &lt;[email protected]&gt; <br />Co-authored-by: Taylor Turner &lt;[email protected]&gt; <br />Co-authored-by: James Schadt &lt;[email protected]&gt; <br />Co-authored-by: Michael Davis &lt;[email protected]&gt;</p> armaan-dhillon tag:github.com,2008:Repository/311379516/0.12.0 2024-06-14T17:33:34Z 0.12.0 <p>staging/main/0.12.0 (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1145">#1145</a>)</p> <p>* refactor: Upgrade the models to use keras 3.0 (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1138">#1138</a>)</p> <p>* Replace snappy with cramjam (<a class="issue-link js-issue-link" 
href="https://github.com/capitalone/DataProfiler/pull/1091">#1091</a>)</p> <p>* add downloads tile (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1085">#1085</a>)</p> <p>* Replace snappy with cramjam</p> <p>* Delete test_no_snappy</p> <p>---------</p> <p>Co-authored-by: Taylor Turner &lt;[email protected]&gt;</p> <p>* pre-commit fix (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1122">#1122</a>)</p> <p>* Bug fix for float precision calculation using categorical data with trailing zeros. (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1125">#1125</a>)</p> <p>* Revert "Bug fix for float precision calculation using categorical data with t…" (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1133">#1133</a>)</p> <p>This reverts commit <a class="commit-link" href="https://github.com/capitalone/DataProfiler/commit/d3159bd13911892e74c264966fba011d50f20e95"><tt>d3159bd</tt></a>.</p> <p>* refactor: move layers outside of class</p> <p>* refactor: update model to keras 3.0</p> <p>* fix: manifest</p> <p>* fix: bugs in compile and train</p> <p>* fix: bug in load_from_library</p> <p>* fix: bugs in CharCNN</p> <p>* refactor: loading tf model labeler</p> <p>* fix: bug in data_labeler identification</p> <p>* fix: update model to use proper softmax layer names</p> <p>* fix: formatting</p> <p>* fix: remove unused line</p> <p>* refactor: drop support for 3.8</p> <p>* fix: comments</p> <p>* fix: comment</p> <p>---------</p> <p>Co-authored-by: Gábor Lipták &lt;[email protected]&gt; <br />Co-authored-by: Taylor Turner &lt;[email protected]&gt; <br />Co-authored-by: James Schadt &lt;[email protected]&gt;</p> <p>* Fix Tox (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1143">#1143</a>)</p> <p>* tox new</p> <p>* update</p> <p>* update</p> <p>* update</p> <p>* update</p> <p>* 
update</p> <p>* update</p> <p>* update</p> <p>* update tox.ini</p> <p>* update</p> <p>* update</p> <p>* remove docs</p> <p>* empty retrigger</p> <p>* update (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1146">#1146</a>)</p> <p>* bump version</p> <p>* update 3.11</p> <p>* remove dist/</p> <p>---------</p> <p>Co-authored-by: JGSweets &lt;[email protected]&gt; <br />Co-authored-by: Gábor Lipták &lt;[email protected]&gt; <br />Co-authored-by: James Schadt &lt;[email protected]&gt;</p> taylorfturner tag:github.com,2008:Repository/311379516/0.11.0 2024-05-21T17:36:20Z 0.11.0 <p>Version.py update 0.11.0 (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1139">#1139</a>)</p> <p>* Replace snappy with cramjam (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1091">#1091</a>)</p> <p>* add downloads tile (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1085">#1085</a>)</p> <p>* Replace snappy with cramjam</p> <p>* Delete test_no_snappy</p> <p>---------</p> <p>Co-authored-by: Taylor Turner &lt;[email protected]&gt;</p> <p>* Quick fix for dependency max pins (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1120">#1120</a>)</p> <p>* Fix dask_expr</p> <p>* Keras and Tensorflow version fix</p> <p>* Keras and Tensorflow version fix</p> <p>* Fix keras bug</p> <p>* pre-commit fix (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1122">#1122</a>)</p> <p>* docs: update test link to latest version (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1114">#1114</a>)</p> <p>* docs: add contributor notes on where to find documentation branches (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1113">#1113</a>)</p> <p>* docs: add contributor notes on where to find 
documentation branches</p> <p>* docs: update documentation wording to spell out why `dev-gh-pages` and `gh-pages` branches exist for staging content</p> <p>* docs: add note on fork</p> <p>Co-authored-by: Taylor Turner &lt;[email protected]&gt;</p> <p>* Update .github/CONTRIBUTING.md</p> <p>Co-authored-by: Taylor Turner &lt;[email protected]&gt;</p> <p>---------</p> <p>Co-authored-by: Taylor Turner &lt;[email protected]&gt;</p> <p>* update black version (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1131">#1131</a>)</p> <p>* Add memray max version (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1132">#1132</a>)</p> <p>* Bug fix for float precision calculation using categorical data with trailing zeros. (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1125">#1125</a>)</p> <p>* Revert "Bug fix for float precision calculation using categorical data with t…" (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1133">#1133</a>)</p> <p>This reverts commit <a class="commit-link" href="https://github.com/capitalone/DataProfiler/commit/d3159bd13911892e74c264966fba011d50f20e95"><tt>d3159bd</tt></a>.</p> <p>* fix</p> <p>* make up to date</p> <p>* yep, shouldn't change</p> <p>* bump version</p> <p>---------</p> <p>Co-authored-by: Gábor Lipták &lt;[email protected]&gt; <br />Co-authored-by: abajpai15 &lt;[email protected]&gt; <br />Co-authored-by: Patrick Carlson &lt;[email protected]&gt; <br />Co-authored-by: James Schadt &lt;[email protected]&gt;</p> taylorfturner tag:github.com,2008:Repository/311379516/0.10.9 2024-03-06T14:28:49Z 0.10.9 taylorfturner tag:github.com,2008:Repository/311379516/0.10.8 2024-01-11T17:32:37Z 0.10.8 <p>Staging/main/0.10.8 (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1081">#1081</a>)</p> <p>* Feature: added parquet sampling (<a 
class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1070">#1070</a>)</p> <p>* parquet sampling function developed in data_utils.py; Added sample_nrows argument in ParquetData class; Added test_len_sampled_data in test_parquet_data.py</p> <p>* resolved conflict with dev, added more tests</p> <p>* fixed sample empty column bug</p> <p>* fixed comments in data_utils.py, including: <br />1. added type of return in sample_parquet function; <br />2. changed variable names in sample_parquet function to more descriptive names (select -&gt; sample_index, out -&gt; sample_df); <br />3. created convert_unicode_col_to_utf8 function to reduce repeating code in sample_parquet and read_parquet_df functions</p> <p>* 1. renamed variable names in convert_unicode_col_to_utf8 function (data_utils.py) to be more descriptive (types -&gt; input_column_types, col -&gt; iter_column), other part unchanged</p> <p>2. test_parquet_data.py, move import statement to the top of file</p> <p>3. test_parquet_data.py, merged all tests about parquet sample feature to their original tests</p> <p>* checked the datatype and input file path before and after reload with sampling option enabled</p> <p>* test</p> <p>* delete test edit in avro_data.py, updated fastavro version in requirements.txt</p> <p>* remove fastavro.reader type</p> <p>* change fastavro version back to original</p> <p>* 1. sample_parquet function description <br />2. test_len_data method keep one sample length test <br />3. remove sampling test in test_specifying_data_type <br />4. 
remove sampling test in test_reload_data</p> <p>* Dependency: `matplotlib` version bump (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1072">#1072</a>)</p> <p>* bump tag matplotlib</p> <p>* bump to most recent</p> <p>* 3.9.0 update</p> <p>* Bump actions/setup-python from 4 to 5 (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1078">#1078</a>)</p> <p>Bumps [actions/setup-python](<a href="https://github.com/actions/setup-python">https://github.com/actions/setup-python</a>) from 4 to 5. <br />- [Release notes](<a href="https://github.com/actions/setup-python/releases">https://github.com/actions/setup-python/releases</a>) <br />- [Commits](<a class="commit-link" href="https://github.com/actions/setup-python/compare/v4...v5">actions/setup-python@<tt>v4...v5</tt></a>)</p> <p>--- <br />updated-dependencies: <br />- dependency-name: actions/setup-python <br /> dependency-type: direct:production <br /> update-type: version-update:semver-major <br />...</p> <p>Signed-off-by: dependabot[bot] &lt;[email protected]&gt; <br />Co-authored-by: dependabot[bot] &lt;49699333+dependabot[bot]@users.noreply.github.com&gt; <br />Co-authored-by: Taylor Turner &lt;[email protected]&gt;</p> <p>* Make _assimilate_histogram not use self (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1071">#1071</a>)</p> <p>Co-authored-by: Taylor Turner &lt;[email protected]&gt;</p> <p>* version bump</p> <p>---------</p> <p>Signed-off-by: dependabot[bot] &lt;[email protected]&gt; <br />Co-authored-by: WML &lt;[email protected]&gt; <br />Co-authored-by: dependabot[bot] &lt;49699333+dependabot[bot]@users.noreply.github.com&gt; <br />Co-authored-by: Junho Lee &lt;[email protected]&gt;</p> taylorfturner tag:github.com,2008:Repository/311379516/0.10.7 2023-11-14T19:47:24Z 0.10.7 <p>Staging/main/0.10.7 (<a class="issue-link js-issue-link" 
href="https://github.com/capitalone/DataProfiler/pull/1068">#1068</a>)</p> <p>* black formatting (<a class="issue-link js-issue-link" href="https://github.com/capitalone/DataProfiler/pull/1067">#1067</a>)</p> <p>* Update version 0.10.7</p> taylorfturner