tag:github.com,2008:https://github.com/google/deepvariant/releases Release notes from deepvariant 2026-03-05T19:00:42Z tag:github.com,2008:Repository/111751293/v1.10.0 2026-03-05T23:36:43Z DeepVariant 1.10.0 <h4><a href="https://github.com/google/deepvariant">DeepVariant</a>:</h4> <ul> <li><strong>Continuous phasing</strong>: Long-read variant calls (PacBio and ONT) are now natively phased, and phased output is generated for both VCF and gVCF formats.</li> <li><strong>Fuzzy channels</strong>: Added “fuzzy channel” logic to the ONT model for better homopolymer resolution. This results in ~20-25% error reduction compared to existing methods.</li> <li><strong>RNA-seq support</strong>: RNA-seq is now supported as a model type. A case study has been added for RNA-seq data.</li> <li><strong>Postprocessing improvement</strong>: Implemented a new multiallelic variant post-processing method called “product”, which is enabled for all modes except WES.</li> <li><strong>Streamlining input parameters</strong>: <code>run_deepvariant</code> and <code>run_deepsomatic</code> now read parameters from <code>model.example_info.json</code> files, which must be present alongside the models in order to run.</li> </ul> <h4><a href="https://github.com/google/deepsomatic">DeepSomatic</a>:</h4> <ul> <li><strong>Small model in DeepSomatic</strong>: Introduced small models for tumor-normal modes in DeepSomatic, improving runtime by 12% to 40%.</li> </ul> <h4><a href="https://github.com/google/deepvariant/blob/r1.9/docs/pangenome-aware-wgs-vg-case-study.md">Pangenome-aware DeepVariant</a>:</h4> <ul> <li><strong>Local reassembly improvements</strong>: Improved the local reassembly process with a de Bruijn graph, reducing total errors by ~18% on the HG002 T2T truth set.</li> </ul> <p>Contributions:</p> <ul> <li>Ehud Amitai (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/ehudamitai/hovercard" data-octo-click="hovercard-link-click" 
data-octo-dimensions="link_type:self" href="https://github.com/ehudamitai">@ehudamitai</a>) from Ultima Genomics for developing the algorithm behind the multiallelic variant post-processing method that is available as the “product” option.</li> <li>Vasiliy Strelnikov (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/vaxyzek/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/vaxyzek">@vaxyzek</a>) for streamlining the run_deepvariant script by enabling automatic flag loading using model.example_info.json files.</li> <li>Sowmiya Nagarajan (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/sonagarajan/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/sonagarajan">@sonagarajan</a>) for helping to update the RNA-seq model.</li> <li>Shezan Rohinton Mirzan (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/shezanmirzan/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/shezanmirzan">@shezanmirzan</a>) for migrating the small model to Keras 3 and modernizing core infrastructure.</li> <li>Francisco Unda (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/fcoUnda/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/fcoUnda">@fcoUnda</a>) for enhancing read sampling stability, fixing non-determinism, and creating a robust read sampling approach for high coverage.</li> <li>Alec Zhang (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/az-e/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/az-e">@az-e</a>) for providing essential internal updates and maintenance to the codebase.</li> </ul> 
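<p>As a rough illustration of the parameter streamlining described above, the sketch below turns the <code>flags_for_calling</code> object of a <code>model.example_info.json</code> file into command-line style arguments. The <code>flags_for_calling</code> key is taken from the example file shown in the 1.10.0-beta notes; the helper name <code>flags_from_example_info</code> and the <code>--name=value</code> rendering are assumptions made for illustration, not DeepVariant's actual loading code.</p>

```python
import json


def flags_from_example_info(text):
    """Render the flags_for_calling object as sorted --name=value strings.

    Hypothetical helper for illustration only; not part of DeepVariant.
    """
    info = json.loads(text)
    flags = info.get("flags_for_calling", {})
    args = []
    for name in sorted(flags):
        value = flags[name]
        # Render booleans in lowercase, matching the flag style used in these notes.
        if isinstance(value, bool):
            value = "true" if value else "false"
        args.append(f"--{name}={value}")
    return args


# Abbreviated sample modeled on the 1.10.0-beta example file.
SAMPLE = """
{
  "version": "1.10.0-beta",
  "flags_for_calling": {
    "phase_reads": true,
    "min_mapping_quality": 1,
    "realign_reads": false
  }
}
"""

if __name__ == "__main__":
    for arg in flags_from_example_info(SAMPLE):
        print(arg)
```

<p>Printing the flags this way can help when comparing a model's packaged settings against the flags reported in the <code>make_examples</code> logs.</p>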
kishwarshafin tag:github.com,2008:Repository/111751293/v1.10.0-beta 2025-10-09T17:48:00Z v1.10.0-beta <p>This beta release focuses solely on DeepVariant, with no updates for pangenome-aware DeepVariant or DeepTrio. We encourage users to provide feedback, report bugs, and offer suggestions to help us improve.</p> <ul> <li>Code is available on the <a href="https://github.com/google/deepvariant/tree/r1.10.0-beta">r1.10.0-beta</a> branch.</li> <li>Docker: <code>google/deepvariant:1.10.0-beta</code></li> <li>Docker (GPU): <code>google/deepvariant:1.10.0-beta-gpu</code></li> <li>We have updated the <a href="https://github.com/google/deepvariant/blob/r1.10.0-beta/docs/metrics.md">metrics</a> page with the latest accuracy / runtime results.</li> </ul> <p>Key updates are detailed below.</p> <h3>Continuous Phasing</h3> <p>It is now possible for DeepVariant to natively emit a phased VCF for long reads (PacBio and ONT), leveraging the long-range information from these reads to accurately phase variants and assign a haplotype.</p> <p>To enable this feature, you must set the following flags when running with <code>run_deepvariant</code>:</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="--make_examples_extra_args=&quot;phase_reads=true,output_phase_info=true,output_local_read_phasing=/tmp/read-phasing_debug@${N_SHARDS}.tsv&quot; \ --postprocess_variants_extra_args=&quot;phased_reads_input_path=/tmp/read-phasing_debug@${N_SHARDS}.tsv&quot;"><pre class="notranslate"><code>--make_examples_extra_args="phase_reads=true,output_phase_info=true,output_local_read_phasing=/tmp/read-phasing_debug@${N_SHARDS}.tsv" \ --postprocess_variants_extra_args="phased_reads_input_path=/tmp/read-phasing_debug@${N_SHARDS}.tsv" </code></pre></div> <p>Make sure that <code>N_SHARDS</code> matches the sharding set globally.</p> <h3>model.example_info.json</h3> <p>Models can now be packaged with an extra file called 
<code>model.example_info.json</code> which carries the flags needed to generate examples (model inputs) when running inference. Here is an example of what this looks like:</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="{ &quot;version&quot;: &quot;1.10.0-beta&quot;, &quot;shape&quot;: [100, 147, 10], &quot;channels&quot;: [1, 2, 3, 4, 5, 6, 7, 26, 9, 10], &quot;flags_for_calling&quot;: { &quot;alt_aligned_pileup&quot;: &quot;diff_channels&quot;, &quot;call_small_model_examples&quot;: true, &quot;keep_supplementary_alignments&quot;: true, &quot;max_reads_per_partition&quot;: 600, &quot;min_mapping_quality&quot;: 1, &quot;parse_sam_aux_fields&quot;: true, &quot;partition_size&quot;: 25000, &quot;phase_reads&quot;: true, &quot;pileup_image_height&quot;: 100, &quot;pileup_image_width&quot;: 147, &quot;realign_reads&quot;: false, &quot;small_model_indel_gq_threshold&quot;: 16, &quot;small_model_snp_gq_threshold&quot;: 15, &quot;small_model_vaf_context_window_size&quot;: 51, &quot;sort_by_haplotypes&quot;: true, &quot;track_ref_reads&quot;: true, &quot;trained_small_model_path&quot;: &quot;/opt/smallmodels/pacbio&quot;, &quot;trim_reads_for_pileup&quot;: true, &quot;vsc_min_fraction_indels&quot;: 0.12 } }"><pre class="notranslate"><code>{ "version": "1.10.0-beta", "shape": [100, 147, 10], "channels": [1, 2, 3, 4, 5, 6, 7, 26, 9, 10], "flags_for_calling": { "alt_aligned_pileup": "diff_channels", "call_small_model_examples": true, "keep_supplementary_alignments": true, "max_reads_per_partition": 600, "min_mapping_quality": 1, "parse_sam_aux_fields": true, "partition_size": 25000, "phase_reads": true, "pileup_image_height": 100, "pileup_image_width": 147, "realign_reads": false, "small_model_indel_gq_threshold": 16, "small_model_snp_gq_threshold": 15, "small_model_vaf_context_window_size": 51, "sort_by_haplotypes": true, "track_ref_reads": true, "trained_small_model_path": 
"/opt/smallmodels/pacbio", "trim_reads_for_pileup": true, "vsc_min_fraction_indels": 0.12 } } </code></pre></div> <p>The flags used to generate examples are specific to each model, and it is important that they are set correctly for a given model to match the characteristics the model was trained on.</p> <p><strong>How is <code>model.example_info.json</code> useful?</strong></p> <p>DeepVariant can be run in two ways. The first way is to use the <code>run_deepvariant</code> command, which automatically sets options and runs each stage of DeepVariant.</p> <p>The second way is to run these stages (<code>make_examples</code>, <code>call_variants</code>, and <code>postprocess_variants</code>) individually. This method can be significantly faster and more efficient because <code>make_examples</code> and <code>call_variants</code> can be parallelized - even across multiple machines. However, previously, this approach required that the flags for make_examples be set manually, which makes constructing more efficient pipelines tricky. 
With this change, users can provide the <code>make_examples</code> stage with the <code>--checkpoint</code> flag, and the <code>model.example_info.json</code> file will be read and used to set the flags appropriate for the given model.</p> <p><strong>Using <code>model.example_info.json</code>:</strong></p> <p>Here is an example illustrating how you could make use of this setup:</p> <div class="highlight highlight-source-shell notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="make_examples \ --mode calling \ --ref hg38.fa \ --reads pacbio_input.bam \ --examples &quot;[email protected]&quot; \ --checkpoint &quot;/opt/models/pacbio&quot; \ --task=1"><pre>make_examples \ --mode calling \ --ref hg38.fa \ --reads pacbio_input.bam \ --examples <span class="pl-s"><span class="pl-pds">"</span>[email protected]<span class="pl-pds">"</span></span> \ --checkpoint <span class="pl-s"><span class="pl-pds">"</span>/opt/models/pacbio<span class="pl-pds">"</span></span> \ --task=1</pre></div> <p>The logs should report the flags that are then set using model.example_info.json:</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="[make_examples_core.py:3794] Flags for calling: alt_aligned_pileup: diff_channels call_small_model_examples: True keep_supplementary_alignments: True …"><pre class="notranslate"><code>[make_examples_core.py:3794] Flags for calling: alt_aligned_pileup: diff_channels call_small_model_examples: True keep_supplementary_alignments: True … </code></pre></div> <h3>Docker Images are Streamlined</h3> <p>Docker images have been simplified to have fewer layers and to remove unnecessary files / layers. 
The table below illustrates the difference in terms of disk size and the number of layers.</p> <table> <thead> <tr> <th>Version</th> <th>Size</th> <th>Number of Layers</th> </tr> </thead> <tbody> <tr> <td>1.9</td> <td>6.1GB</td> <td>114</td> </tr> <tr> <td>1.10.0-beta</td> <td>4.8GB</td> <td>23</td> </tr> </tbody> </table> <p>This reduces the size by ~21% and the number of layers by ~80%.</p> <h3>Additional Updates</h3> <p>This list is not exhaustive, and smaller bug fixes and improvements may not be listed here.</p> <ul> <li>The ONT model now uses a new input channel, <code>READ_SUPPORTS_VARIANT_FUZZY</code>, which indicates support for a variant based on a fuzzy rather than exact match.</li> <li>The ONT model now sets <code>alt_aligned_pileup='rows'</code>, meaning that alternative alignments are encoded using additional pileup rows in the model input, rather than additional channels.</li> <li>The PacBio model now uses the <code>--keep_supplementary_alignments</code> flag, which leads to a slight improvement in accuracy.</li> <li>TensorFlow updated from <code>2.13.1</code> to <code>2.16.1</code>.</li> <li>CUDA has been updated from <code>11.8</code> to <code>12.3</code> and cuDNN has been updated from <code>8.6.0</code> to <code>8.9.0</code> in our GPU docker image.</li> <li>Use <code>std::stable_sort</code> instead of <code>std::sort</code> for pileup image rows. This leads to consistent pileup image generation.</li> </ul> <hr> <p>Note: Some outputs (e.g. 
VCF) may still report v1.9 in the header as we did not update all version references.</p> danielecook tag:github.com,2008:Repository/111751293/v1.9.0 2025-05-13T20:01:31Z DeepVariant 1.9.0 <h4><a href="https://github.com/google/deepvariant">DeepVariant</a>:</h4> <ul> <li>In this version, we have updated our training scheme for the HG002 sample with the newly released HG002-T2T truth set, which improves accuracy against that truth set.</li> <li>Our labeling method has been updated to accommodate the complex representations of variants, which are more common in the new HG002 T2T truth set.</li> <li>Faster inference (~20% runtime reduction), achieved by improving NumPy array and tensor handling in call_variants.</li> </ul> <h4><a href="https://github.com/google/deepsomatic">DeepSomatic</a>:</h4> <ul> <li>In this release, we are introducing <a href="https://github.com/google/deepsomatic/blob/r1.9/docs/deepsomatic-case-study-ffpe-wgs-tumor-only.md"><code>FFPE_WGS_TUMOR_ONLY</code></a> and <a href="https://github.com/google/deepsomatic/blob/r1.9/docs/deepsomatic-case-study-ffpe-wes-tumor-only.md"><code>FFPE_WES_TUMOR_ONLY</code></a> models.</li> <li>The <a href="https://github.com/google/deepsomatic/blob/r1.9/docs/deepsomatic-case-study-wgs.md"><code>WGS</code></a> and <a href="https://github.com/google/deepsomatic/blob/r1.9/docs/deepsomatic-case-study-wgs-tumor-only.md"><code>WGS_TUMOR_ONLY</code></a> models have been retrained with all datasets described in the manuscript, tumor-in-normal and normal contamination datasets.</li> <li>Overall, we see improved generalization because of training dataset updates. 
We highly recommend updating to <strong>1.9.0</strong> for DeepSomatic analysis.</li> </ul> <h4><a href="https://github.com/google/deepvariant/blob/r1.9/docs/deeptrio-details.md">DeepTrio</a>:</h4> <ul> <li>Very large speed improvement - <a href="https://github.com/google/deepvariant/blob/r1.9/docs/metrics-deeptrio.md"><strong>reduced runtime by 80%</strong></a>. This is achieved by introducing the small model scheme to DeepTrio. We observe similar or better accuracy compared to previous versions.</li> <li>We observe the inclusion of Small model improves de novo variant accuracy for DeepTrio.</li> </ul> <h4><a href="https://github.com/google/deepvariant/blob/r1.9/docs/pangenome-aware-wgs-vg-case-study.md">Pangenome-aware DeepVariant</a>:</h4> <ul> <li>All models have been trained with the HG002 T2T truth set which shows improved accuracy in the new T2T truth set.</li> </ul> <hr> <p>We are thankful for the contributions from:</p> <ul> <li><strong>Ben Soudry</strong> (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/ben-soudry/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/ben-soudry">@ben-soudry</a>) -- For helping to refactor the channels interface and simplifying the process of adding new channels.</li> <li><strong>Mike Kruskal</strong> (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/mkruskal-google/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/mkruskal-google">@mkruskal-google</a>) -- For helping to upgrade tensorflow and protobuf versions.</li> <li><strong>Sowmiya Nagarajan</strong> (@strangest-quark) -- Working on phasing candidate variants.</li> <li><strong>Suchismita Tripathy</strong> (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/sushi15/hovercard" data-octo-click="hovercard-link-click" 
data-octo-dimensions="link_type:self" href="https://github.com/sushi15">@sushi15</a>) -- Improving the SNP and INDEL metrics reporting during training.</li> <li><strong>Francisco Unda</strong> (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/fcoUnda/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/fcoUnda">@fcoUnda</a>) -- Improving the downsampling approach in make_examples to improve representations for low allele frequency variants.</li> <li><strong>Vasiliy Strelnikov</strong> (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/vaxyzek/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/vaxyzek">@vaxyzek</a>) - adding deepsomatic capabilities into nf-core: <a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="2517014949" data-permission-text="Title is private" data-url="https://github.com/nf-core/modules/issues/6622" data-hovercard-type="pull_request" data-hovercard-url="/nf-core/modules/pull/6622/hovercard" href="https://github.com/nf-core/modules/pull/6622">nf-core/modules#6622</a></li> <li><strong>Sam Yadav</strong> (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/yadavs33-roche/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/yadavs33-roche">@yadavs33-roche</a>) and <strong>Seraj Ahmad</strong> (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/ahmads9-roche/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/ahmads9-roche">@ahmads9-roche</a>) for their contribution to improve the examples shuffle code.</li> </ul> <p>Student researchers:</p> <ul> <li><strong>Mobin Asri</strong> (<a class="user-mention notranslate" 
data-hovercard-type="user" data-hovercard-url="/users/mobinasri/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/mobinasri">@mobinasri</a>) -- Further improving the implementation of pangenome-aware DeepVariant.</li> <li><strong>Farica Zhuang</strong> (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/faricazjj/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/faricazjj">@faricazjj</a>) -- For contributing to the phasing method within DeepVariant.</li> </ul> kishwarshafin tag:github.com,2008:Repository/111751293/v1.8.0 2024-12-09T23:51:20Z DeepVariant 1.8.0 <p>In this release:</p> <ul> <li> <p><strong>Small model integration:</strong> Speed increased by ~1.7x (40% runtime reduction) for WGS, PacBio, and ONT by introducing an additional small model. The small model identifies easy-to-call sites and invokes the standard DeepVariant model for harder sites. We observe similar or improved accuracies and confidence calibration with this combination. Use of the small model can be disabled with the <code>--disable_small_model=true</code> option. For details, please see <a href="https://github.com/google/deepvariant/blob/r1.8/docs/deepvariant-small-model-details.md">small model details doc</a>.</p> </li> <li> <p><strong>Pangenome-aware variant calling:</strong> Added a new ability to directly use information from a pangenome in the process of variant calling. This improves accuracy both with BAMs mapped with standard BWA and with BAMs mapped with vg-Giraffe to a pangenome. 
Error reduction is ~30% with <a href="https://github.com/google/deepvariant/blob/r1.8/docs/pangenome-aware-wgs-vg-case-study.md">vg-Giraffe mapped WGS</a>, 10% with <a href="https://github.com/google/deepvariant/blob/r1.8/docs/pangenome-aware-wgs-bwa-case-study.md">BWA-mapped WGS</a>, and 5% for <a href="https://github.com/google/deepvariant/blob/r1.8/docs/pangenome-aware-wes-bwa-case-study.md">BWA-mapped WES</a>. See details in <a href="https://github.com/google/deepvariant/blob/r1.8/docs/pangenome-aware-metrics.md">metrics page</a>.</p> </li> <li> <p><strong>Configure a fast pipeline:</strong> Optional mode to increase efficiency for high-throughput GPU implementations. These configurations pipeline example generation with GPU-based variant calling to increase utilization of GPU resources. See <a href="https://github.com/google/deepvariant/blob/r1.8/docs/deepvariant-fast-pipeline-case-study.md">case study</a> for details.</p> </li> <li> <p>Introduced new Mas-Seq models for variant calling with Kinnex kits/Mas-Seq data. See <a href="https://github.com/google/deepvariant/blob/r1.8/docs/deepvariant-masseq-case-study.md">case study</a> for details.</p> </li> <li> <p>PacBio models are now trained with labels from the Platinum Pedigree, which reduces errors by 34% on this more comprehensive truth set including very difficult parts of the genome.</p> </li> <li> <p>Added SPRQ data to PacBio training datasets, improving accuracy for SPRQ chemistry. Updated the PacBio case study data to 2024 SPRQ release. Reduced error on SPRQ chemistry by 27% relative to DeepVariant v1.6. Updating to DeepVariant v1.8 is recommended for SPRQ.</p> </li> <li> <p>Updated how model file metadata is specified, to accommodate more flexible ways of specifying channels. Custom models now require an accompanying example_info.json file, containing the image shape details generated during training image generation, in the make_examples and call_variants stages. 
An example of using a custom model is the <a href="https://github.com/google/deepvariant/blob/r1.8/docs/deepvariant-complete-t7-case-study.md">T7 case study</a>, where the <code>example_info.json</code> file is downloaded in this <a href="https://github.com/google/deepvariant/blob/r1.8/docs/deepvariant-complete-t7-case-study.md#download-complete-genomics-t7-model">section</a> so that DeepVariant runs successfully.</p> </li> </ul> <p>We are thankful for the contributions from:</p> <ul> <li>Mobin Asri (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/mobinasri/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/mobinasri">@mobinasri</a>) and Juan Carlos Mier (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/jmier2/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/jmier2">@jmier2</a>) on pangenome-aware DeepVariant work.</li> <li>Ralf W. 
Grosse-Kunstleve (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/rwgk/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/rwgk">@rwgk</a>) for helping to migrate from CLIF to pybind.</li> <li>Shiyi Yin (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/yinshiyi/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/yinshiyi">@yinshiyi</a>) for Mas-Seq model work.</li> <li>Maya Venkatraman (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/mv2731/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/mv2731">@mv2731</a>) for helping to explore model architectures.</li> <li>Ben Soudry (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/ben-soudry/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/ben-soudry">@ben-soudry</a>) for helping to streamline channel inputs.</li> <li>Atilla Kiraly (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/akiraly1/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/akiraly1">@akiraly1</a>) and Yuchen Zhou (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/Yuchen-95/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/Yuchen-95">@Yuchen-95</a>) on explainability work.</li> <li>Jorge Gonzalez Mendez (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/jgonzalezmendez/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" 
href="https://github.com/jgonzalezmendez">@jgonzalezmendez</a>) on improving the C++ code quality.</li> <li>Stephanie Steele (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/stesteele/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/stesteele">@stesteele</a>) for helping migrate python code to C++.</li> </ul> kishwarshafin tag:github.com,2008:Repository/111751293/r1.8.0 2024-12-04T04:48:58Z r1.8.0: Fix typos and doc formatting <p>PiperOrigin-RevId: 702568997</p> kishwarshafin tag:github.com,2008:Repository/111751293/v1.6.1 2024-03-19T19:20:10Z DeepVariant 1.6.1 <p>In this release:</p> <ul> <li>We fixed a bug in <code>call_variants</code> that caused the step to freeze in cases where there were no examples. This bug was observed and reported in <a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="2089887184" data-permission-text="Title is private" data-url="https://github.com/google/deepvariant/issues/764" data-hovercard-type="issue" data-hovercard-url="/google/deepvariant/issues/764/hovercard" href="https://github.com/google/deepvariant/issues/764">#764</a>, <a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="2116872858" data-permission-text="Title is private" data-url="https://github.com/google/deepvariant/issues/769" data-hovercard-type="issue" data-hovercard-url="/google/deepvariant/issues/769/hovercard" href="https://github.com/google/deepvariant/issues/769">#769</a>, <a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="1986945508" data-permission-text="Title is private" data-url="https://github.com/google/deepsomatic/issues/8" data-hovercard-type="issue" data-hovercard-url="/google/deepsomatic/issues/8/hovercard" href="https://github.com/google/deepsomatic/issues/8">google/deepsomatic#8</a>.</li> <li>Updated <code>libssw</code> library from 1.2.4 to 1.2.5.</li> 
<li>The same model files are used for v1.6.0 and v1.6.1 for all technologies.</li> </ul> kishwarshafin tag:github.com,2008:Repository/111751293/v1.6.0 2023-10-24T05:09:46Z DeepVariant 1.6.0 <ul> <li>Improved support for haploid regions, chrX and chrY. Users can specify haploid regions with a flag. <a href="https://github.com/google/deepvariant/blob/r1.6/docs/deepvariant-xy-calling-case-study.md">Updated case studies</a> show usage and metrics.</li> <li>Added pangenome workflow (FASTQ-to-VCF mapping with VG and DeepVariant calling). <a href="https://github.com/google/deepvariant/blob/r1.6/docs/deepvariant-vg-case-study.md">Case study</a> demonstrates improved accuracy.</li> <li>Substantial improvements to DeepTrio de novo accuracy by specifically training DeepTrio for this use case (for chr20 at 30x HG002-HG003-HG004, false negatives reduced from 8 to 0 with DeepTrio v1.4, false positives reduced from 5 to 0).</li> <li>We have added multi-processing ability in <code>postprocess_variants</code>, which reduces runtime from 48 minutes to 30 minutes for Illumina WGS and from 56 minutes to 33 minutes for PacBio.</li> <li>We have added new models trained with Complete Genomics data, and added case studies.</li> <li>We have added NovaSeqX to the training data for the WGS model.</li> <li>We have migrated our training and inference platform from Slim to Keras.</li> <li>Force calling with approximate phasing is now available.</li> </ul> <p>We are sincerely grateful to</p> <ul> <li><a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/wkwan/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/wkwan">@wkwan</a> and <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/paulinesho/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/paulinesho">@paulinesho</a> for helping with the move to Keras.</li> 
<li><a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/lucasbrambrink/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/lucasbrambrink">@lucasbrambrink</a> for enabling multiprocessing in <code>postprocess_variants</code>.</li> <li><a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/MSamman/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/MSamman">@MSamman</a>, <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/akiraly1/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/akiraly1">@akiraly1</a> for their contributions.</li> <li>PacBio: William Rowell (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/williamrowell/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/williamrowell">@williamrowell</a>), Nathaniel Echols for their feedback and testing.</li> <li>UCSC: Benedict Paten(<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/benedictpaten/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/benedictpaten">@benedictpaten</a>), Shloka Negi (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/shlokanegi/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/shlokanegi">@shlokanegi</a>), Jimin Park (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/jimin001/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/jimin001">@jimin001</a>), Mobin Asri (<a class="user-mention 
notranslate" data-hovercard-type="user" data-hovercard-url="/users/mobinasri/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/mobinasri">@mobinasri</a>) for the feedback.</li> </ul> kishwarshafin tag:github.com,2008:Repository/111751293/v1.5.0 2023-02-28T18:11:51Z DeepVariant 1.5.0 <ul> <li>New model datatype: <code>--model_type ONT_R104</code> is a new option. Starting from v1.5, DeepVariant natively supports ONT R10.4 simplex and duplex data. <ul> <li>For older ONT chemistry, please continue to use <a href="https://github.com/kishwarshafin/pepper">PEPPER-Margin-DeepVariant</a>.</li> </ul> </li> <li>Incorporated PacBio Revio training data in DeepVariant PacBio model. In our evaluations this single model performs well on both Sequel II and Revio datatypes. Please use DeepVariant v1.5 and later for Revio data.</li> <li>Incorporated Element Biosciences data in WGS models. We found that we could jointly train a short-read WGS model with both Illumina and Element data. Inclusion of Element data improves accuracy on Element without negative effect on Illumina. Please use the WGS model for best results on either Illumina or Element data.</li> <li>Added vg/Giraffe-mapped BAMs to DeepVariant WGS training data (alongside existing BWA). 
We observed that a single model can be trained for strong results with both BWA and vg/Giraffe.</li> <li>Improved DeepVariant WES model for 100bps exome sequencing thanks to user-reported issues (including <a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="1453830031" data-permission-text="Title is private" data-url="https://github.com/google/deepvariant/issues/586" data-hovercard-type="issue" data-hovercard-url="/google/deepvariant/issues/586/hovercard" href="https://github.com/google/deepvariant/issues/586">#586</a> and <a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="1469409550" data-permission-text="Title is private" data-url="https://github.com/google/deepvariant/issues/592" data-hovercard-type="issue" data-hovercard-url="/google/deepvariant/issues/592/hovercard" href="https://github.com/google/deepvariant/issues/592">#592</a>).</li> <li>Thanks to Tong Zhu from Nvidia for his suggestion to <a href="https://github.com/google/deepvariant/commit/249e318470395fcc55fd5377f77a67e988288021">improve the logic for shuffling reads</a>.</li> <li>Thanks to Doron Shem-Tov (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/doron-st/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/doron-st">@doron-st</a>) and Ilya Soifer (<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/ilyasoifer/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/ilyasoifer">@ilyasoifer</a>) from Ultima Genomics for adding new functionalities enabled by flags <code>--enable_joint_realignment</code> and <code>--p_error</code>.</li> <li>Thanks to Dennis Yelizarov for improving Google-internal infrastructure for running make_examples.</li> <li>Updated TensorFlow version to 2.11.0. 
Updated htslib version to 1.13.</li> </ul> pichuan tag:github.com,2008:Repository/111751293/v1.4.0 2024-05-15T20:11:42Z DeepVariant 1.4.0 <ul> <li>Simplified DeepVariant PacBio by introducing <strong>approximate haplotagging</strong>. This means PacBio users no longer need to run the DeepVariant+WhatsHap+DeepVariant workflow. See the <a href="https://github.com/google/deepvariant/blob/r1.4/docs/deepvariant-pacbio-model-case-study.md">PacBio case study</a> for more information.</li> <li>For Illumina WGS and WES, we added read insert size (<code>insert_size</code>) as an additional feature. This reduces errors by <strong>4-10%</strong> for the Illumina WGS and WES models. Thanks to <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/lucasbrambrink/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/lucasbrambrink">@lucasbrambrink</a> for implementing this feature.</li> <li>Reduced the runtime of the <code>postprocess_variants</code> step by <strong>10-30%</strong>. Thanks to <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/MosheWagner/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/MosheWagner">@MosheWagner</a> for optimizing the code.</li> <li>Included experimental code that explores the use of Keras for model architecture. This is not used in production methods, but may be informative to developers seeking examples of Keras applied to similar problems. 
Thanks to <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/wkwan/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/wkwan">@wkwan</a> and <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/paulinesho/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/paulinesho">@paulinesho</a> for their contributions.</li> <li>OpenVINO is not included by default in the Docker images we released. Users can still build their own Docker images with the option turned on as needed.</li> <li><strong>Updated 2022-10-17</strong>: We have released an Illumina RNA-seq model and added an <a href="https://github.com/google/deepvariant/blob/r1.4/docs/deepvariant-rnaseq-case-study.md">RNA-seq case study</a>.</li> </ul> pichuan tag:github.com,2008:Repository/111751293/v1.3.0 2021-12-10T07:12:31Z DeepVariant 1.3.0 <ul> <li>Improved the DeepTrio PacBio models on PacBio Sequel II Chemistry v2.2 by including this data in the training dataset.</li> <li>Improved <code>call_variants</code> speed for PacBio models (both DeepVariant and DeepTrio) by reducing the default window width from 221 to 199, without a tradeoff in accuracy. Thanks to <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/lucasbrambrink/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/lucasbrambrink">@lucasbrambrink</a> for conducting the experiments to find a better window width for PacBio.</li> <li>Introduced a new flag <code>--normalize_reads</code> in <code>make_examples</code>, which normalizes indel candidates at the read level. This flag helps reduce rare cases where an indel variant is not left-normalized. 
This feature is mainly relevant to joint calling of large cohorts, or to cases where read mappings have been surjected from one reference to another. It is currently set to False by default. To enable it, add <code>--normalize_reads=true</code> directly to the <code>make_examples</code> binary. If you’re using the <code>run_deepvariant</code> one-step approach, add <code>--make_examples_extra_args="normalize_reads=true"</code>. Currently we don’t recommend turning this flag on for long reads because of the potential runtime increase.</li> <li>Added an <code>--aux_fields_to_keep</code> flag to the <code>make_examples</code> step, and set the default to only the auxiliary fields that DeepVariant currently uses. This reduces memory use for input BAM files that have large auxiliary fields that aren’t used in variant calling. Thanks to <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/williamrowell/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/williamrowell">@williamrowell</a> and <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/rhallPB/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/rhallPB">@rhallPB</a> for reporting this issue.</li> <li>Reduced the frequency of logging in <code>make_examples</code> as well as <code>call_variants</code> to address the issue reported in <a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="1043772149" data-permission-text="Title is private" data-url="https://github.com/google/deepvariant/issues/491" data-hovercard-type="issue" data-hovercard-url="/google/deepvariant/issues/491/hovercard" href="https://github.com/google/deepvariant/issues/491">#491</a>.</li> </ul> pichuan