<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://lnerbonne.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://lnerbonne.github.io/" rel="alternate" type="text/html" /><updated>2025-05-20T18:57:36+00:00</updated><id>https://lnerbonne.github.io/feed.xml</id><title type="html">Lucas Nerbonne’s Portfolio</title><subtitle>Write an awesome description for your new site here. You can edit this line in _config.yml. It will appear in your document head meta (for Google search results) and in your feed.xml site description.</subtitle><author><name>Lucas Nerbonne</name></author><entry><title type="html">Sterling Reproduction Study</title><link href="https://lnerbonne.github.io/blog/sterlingresults/" rel="alternate" type="text/html" title="Sterling Reproduction Study" /><published>2025-05-20T00:00:00+00:00</published><updated>2025-05-20T00:00:00+00:00</updated><id>https://lnerbonne.github.io/blog/sterlingresults</id><content type="html" xml:base="https://lnerbonne.github.io/blog/sterlingresults/"><![CDATA[<p>For my third and final project of this semester’s OpenGIScience course, my classmate Jorre Dahl and I undertook a reproduction of Charles Sterling’s 2023 study titled “Connections between present-day water access and historical redlining”. Over the course of this reproduction we wrangled huge census datasets, dealt with broken geometries throughout datasets, and reproduced Sterling’s regression analysis on water access in redlined neighborhoods. I learned a lot about both the application of and structure behind geographic methods and am extremely proud of our result.</p>

<p>View the final report <a href="https://jorredahl.github.io/RPr-Sterling-2023/Final%20Report.html">here</a></p>

<p>Or view the GitHub repository <a href="https://github.com/jorredahl/RPr-Sterling-2023">here</a></p>

<p>This project contributes a well-documented, open-source set of research methods that could conceivably be applied to many different questions about the historical effects of redlining. The workflow is made accessible through full data and code transparency and by limiting implementation to a single language: we found ways to conduct in R alone an analysis that Sterling carried out in R, ArcGIS, and Python.</p>

<h3 id="references">References</h3>

<blockquote>
  <p>Sterling III, Charles W., et al. “Connections between present-day water access and historical redlining.” Environmental Justice (2023). DOI:<a href="https://doi.org/10.1089/env.2022.0115">10.1089/env.2022.0115</a></p>
</blockquote>]]></content><author><name>Lucas Nerbonne</name></author><category term="Blog" /><category term="Reproducibility" /><category term="GIscience" /><category term="Data Analysis" /><summary type="html"><![CDATA[For my third and final project of this semester’s OpenGIScience course, my classmate Jorre Dahl and I undertook a reproduction of Charles Sterling’s 2023 study titled “Connections between present-day water access and historical redlining”. Over the course of this reproduction we wrangled huge census datasets, dealt with broken geometries throughout datasets, and reproduced Sterling’s regression analysis on water access in redlined neighborhoods. I learned a lot about both the application of and structure behind geographic methods and am extremely proud of our result.]]></summary></entry><entry><title type="html">Jay Chakraborty Reproduction Project</title><link href="https://lnerbonne.github.io/blog/chakrabortyresults/" rel="alternate" type="text/html" title="Jay Chakraborty Reproduction Project" /><published>2025-05-11T00:00:00+00:00</published><updated>2025-05-11T00:00:00+00:00</updated><id>https://lnerbonne.github.io/blog/chakrabortyresults</id><content type="html" xml:base="https://lnerbonne.github.io/blog/chakrabortyresults/"><![CDATA[<p>For my next project for OpenGIScience, I undertook a partial reproduction of Jayajit Chakraborty’s 2021 study <code class="language-plaintext highlighter-rouge">Social inequities in the distribution of COVID-19: An intra-categorical analysis of people with disabilities in the U.S.</code>, which sought to investigate the connection between county-level COVID-19 incidence rates and the prevalence of a variety of demographic subgroups. The author was specifically interested in the correlation between disability rates and COVID-19 cases, a relationship of interest to both public health officials and policymakers.</p>

<p>This project gave me the opportunity to dive into the nuts and bolts of epidemiological clustering functions through the author’s choice of using clusters of high incidence counties as a spatial control, with interesting results that I’m still working through.</p>

<p>If you want to check out the project results they’re here: <a href="https://github.com/lnerbonne/Covid_19_Clustering_Original/Final_Analysis_and_Report.html">https://lnerbonne/Covid_19_Clustering_Original/blob/main/docs/report/Final_Analysis_and_Report.html</a></p>

<p>The GitHub repo for the project can be found here: <a href="https://github.com/lnerbonne/Covid_19_Clustering_Original">https://github.com/lnerbonne/Covid_19_Clustering_Original</a></p>

<h3 id="references">References</h3>

<p>Chakraborty, J. 2021. Social inequities in the distribution of COVID-19: An intra-categorical analysis of people with disabilities in the U.S. <em>Disability and Health Journal</em> <strong>14</strong>:1-5. DOI:<a href="https://doi.org/10.1016/j.dhjo.2020.101007">10.1016/j.dhjo.2020.101007</a></p>]]></content><author><name>Lucas Nerbonne</name></author><category term="Blog" /><category term="Reproducibility" /><category term="GIscience" /><category term="Data Analysis" /><summary type="html"><![CDATA[For my next project for OpenGIScience, I undertook a partial reproduction of Jayajit Chakraborty’s 2021 study Social inequities in the distribution of COVID-19: An intra-categorical analysis of people with disabilities in the U.S., which sought to investigate the connection between county-level COVID-19 incidence rates and the prevalence of a variety of demographic subgroups. The author was specifically interested in the correlation between disability rates and COVID-19 cases, a relationship of interest to both public health officials and policymakers.]]></summary></entry><entry><title type="html">Gerrymander Analysis Results</title><link href="https://lnerbonne.github.io/blog/gerrymanderresults/" rel="alternate" type="text/html" title="Gerrymander Analysis Results" /><published>2025-03-14T00:00:00+00:00</published><updated>2025-03-14T00:00:00+00:00</updated><id>https://lnerbonne.github.io/blog/gerrymanderresults</id><content type="html" xml:base="https://lnerbonne.github.io/blog/gerrymanderresults/"><![CDATA[<p>Excited to publish the results of my Alabama Gerrymandering analysis! The new population difference metric did a pretty good job at benchmarking gerrymandering - check out the final product down below!</p>

<p><a href="https://lnerbonne.github.io/gerrymanderAL/report/results.html">View my Alabama Gerrymandering Analysis</a></p>

<p>I was really intrigued by coming full circle on my first round of this pre-planning, execution, and results cycle. I found it especially useful to force myself to interrogate the metadata of the data sources I was using; too often I assume I know what’s in a file and don’t take the time to ensure that I understand it well.</p>

<p>I’m excited to get to work on my next project: a reproduction of the clustering and demographic analysis from a study of county-level COVID-19 data from August 2020.</p>]]></content><author><name>Lucas Nerbonne</name></author><category term="Blog" /><category term="Reproducibility" /><category term="GIscience" /><category term="Data Analysis" /><category term="MAPS" /><category term="Results" /><summary type="html"><![CDATA[Excited to publish the results of my Alabama Gerrymandering analysis! The new population difference metric did a pretty good job at benchmarking gerrymandering - check out the final product down below!]]></summary></entry><entry><title type="html">On procedure documentation: how is it changing the way that I work?</title><link href="https://lnerbonne.github.io/blog/gerrymanderprocedure/" rel="alternate" type="text/html" title="On procedure documentation: how is it changing the way that I work?" /><published>2025-02-23T00:00:00+00:00</published><updated>2025-02-23T00:00:00+00:00</updated><id>https://lnerbonne.github.io/blog/gerrymanderprocedure</id><content type="html" xml:base="https://lnerbonne.github.io/blog/gerrymanderprocedure/"><![CDATA[<p>As a researcher it’s sometimes interesting to step back and ask myself “how did I get to this place in my project?”. Oftentimes in the middle of the research process I can give you a general idea: first I tried x, then I pivoted to y, and now I’m working on z, etc. Whatever path I’ve taken through the research process oftentimes feels like it makes perfect sense to me, even if I can barely tell you why I decided to head down each road in the first place.</p>

<p>This process was tested this week as I set out to write my pre-research plan for an analysis of congressional gerrymandering in Alabama. Instead of my usual let-it-rip process, I used HEGSRR’s <a href="https://github.com/HEGSRR/HEGSRR-Template">open source template</a> for reproducible geographic research to pre-plan my research approach. This included documenting data source metadata, recording processing environment and package metadata, and detailing data transformations before I ever hit ‘run’ on any code. This forced me to think intentionally about what I wanted to get out of each piece of data. Additionally, I spent time spelling out what different results would mean in the context of the study in an attempt to discourage cherry-picking significant results.</p>

<p>This process was especially interesting to me because it forced me to put to paper (or vsc, in this case) what my thought process was. In a lot of ways this is what I’ve spent the most time developing over my four years at Middlebury; hard skills come and go, but what doesn’t is your ability to look over a dataset and make decisions about how to treat data. I’ll be curious to see how my workflow does during implementation, which should get done this week.</p>

<p><a href="https://lnerbonne.github.io/gerrymanderAL/report/pre_planning_documentation.html">View my Pre-Planning documentation</a></p>]]></content><author><name>Lucas Nerbonne</name></author><category term="Blog" /><category term="Reproducibility" /><category term="GIscience" /><category term="Data Analysis" /><category term="Procedure" /><category term="Study Pre-Planning" /><summary type="html"><![CDATA[As a researcher it’s sometimes interesting to step back and ask myself “how did I get to this place in my project?”. Oftentimes in the middle of the research process I can give you a general idea: first I tried x, then I pivoted to y, and now I’m working on z, etc. Whatever path I’ve taken through the research process oftentimes feels like it makes perfect sense to me, even if I can barely tell you why I decided to head down each road in the first place.]]></summary></entry><entry><title type="html">Is GIScience Reproducible?</title><link href="https://lnerbonne.github.io/blog/giscience/" rel="alternate" type="text/html" title="Is GIScience Reproducible?" /><published>2025-02-10T00:00:00+00:00</published><updated>2025-02-10T00:00:00+00:00</updated><id>https://lnerbonne.github.io/blog/giscience</id><content type="html" xml:base="https://lnerbonne.github.io/blog/giscience/"><![CDATA[<p>As the scientific community grows and publication rates increase, it’s paramount for the applicability and trustworthiness of this increasing amount of data to scale with the sheer volume of new information being presented yearly, irrespective of the discipline. In tandem with the wider accessibility of the sciences worldwide, this has resulted in the number of scientific papers published annually more than tripling <a href="https://ncses.nsf.gov/pubs/nsb202333/publication-output-by-region-country-or-economy-and-by-scientific-field">since 2000</a>. 
This trend is especially true for geography, a discipline that could be said to be having something of a ‘kid in a candy store’ boom in research potential, with exponential increases in recorded image volume every year. For just two satellite missions, ESA’s Sentinel 1 and 2, downlinked data volume is over <a href="https://www.google.com/url?q=https://gis.stackexchange.com/questions/286644/what-is-the-annual-data-volume-produced-by-the-individual-sentinel-satellite-mis&amp;sa=D&amp;source=docs&amp;ust=1739747729649896&amp;usg=AOvVaw1zzPQG7BfooNY4Q9VHY_RV">3 petabytes annually</a>. If printed, this data would be enough to fill 1.5 trillion sheets of 8.5 x 11 paper. As the sheer amount of data and its accessibility increase year to year, so too do the opportunities for scientific breakthroughs that warrant dissemination through publication.</p>

<p>As the number of papers published annually has increased, so have concerns that many of the findings presented within are not sufficiently <em>reproducible</em>. This lack of reproducibility goes against one of the central tenets of the Scientific Method: the communality of science allows for work that builds on the findings of others, trusting in the validity of their work to draw further conclusions. This field-wide lack of reproducibility has been frequently publicized in the fields of psychology and medicine as a replicability crisis that threatens to call into question many things previously believed to be scientific fact. A lack of reproducibility can stem from various factors, including failure to share research data, withholding statistical analysis code, or inconsistencies in methodology that leave others unable to replicate your work. These lapses represent failures of the scientific process because they mean that other researchers can’t necessarily rely on the validity of your results.</p>

<p>GIScience, a subdiscipline of geography focused mainly on solving problems through spatial data analysis, is especially well suited both to take advantage of the scientific community’s newfound glut of spatial information and to pave the way for new standards of reproducibility. This could look like several things: sharing specific information on how data was acquired (so that others can get the same data), sharing code from data manipulation (so others can check for mistakes), and documenting your process of analysis (so that others can question what priors you may have introduced to the work). GIScience is well positioned to do this for a couple of reasons. Geographic data sources are sometimes publicly available and, unlike data in many other ‘hard’ sciences, often don’t involve field data collection, something that can’t necessarily be repeated the same way twice. Additionally, computational geographic techniques allow researchers to make their entire workflow public, allowing others to duplicate the work exactly, verify results, and learn from the techniques used.</p>

<p>If this is so possible, are we doing it? Broadly speaking, no. <a href="https://josephholler.shinyapps.io/rpr-survey/">When surveyed</a>, more than half of geographers said they only sometimes or rarely/never use open-source software to communicate their research findings. Similarly, more than three-quarters of respondents said they only sometimes or rarely/never attempt to share the code used in their research. While these usage rates appear poor, the motivation to share work is there; approximately 75% of respondents said that reproducibility was either very or somewhat important within their subfield. This raises the question: where is the disconnect? If researchers know that reproducibility is important but still aren’t implementing these practices in their own work, how can we get them to yes and ‘bake in’ the expectation of open-source research into the GIScience field? The answers likely lie in incentivising the more diligent and time-consuming work that it takes to truly make a project open source, convincing researchers that it’s in their best interest to work together, and giving researchers resources to learn the skills necessary to share their work in a reproducible way.</p>

<p>I’ll be thinking more about this in the weeks ahead, so stay tuned for my definitive and all-encompassing findings (ha ha). I might even share my thought process (so open-source of me)!</p>

<h3 id="references">References</h3>

<p><a href="https://ncses.nsf.gov/pubs/nsb202333/publication-output-by-region-country-or-economy-and-by-scientific-field">Research Publication Output Over Time</a></p>

<p><a href="https://www.google.com/url?q=https://gis.stackexchange.com/questions/286644/what-is-the-annual-data-volume-produced-by-the-individual-sentinel-satellite-mis&amp;sa=D&amp;source=docs&amp;ust=1739747729649896&amp;usg=AOvVaw1zzPQG7BfooNY4Q9VHY_RV">Sentinel 1/2 Data</a></p>

<p><a href="https://josephholler.shinyapps.io/rpr-survey/">Geographer Open-Source Opinions/Practice</a></p>]]></content><author><name>Lucas Nerbonne</name></author><category term="Blog" /><category term="Reproducibility" /><category term="GIscience" /><category term="Data Analysis" /><summary type="html"><![CDATA[As the scientific community grows and publication rates increase, it’s paramount for the applicability and trustworthiness of this increasing amount of data to scale with the sheer volume of new information being presented yearly, irrespective of the discipline. In tandem with the wider accessibility of the sciences worldwide, this has resulted in the number of scientific papers published annually more than tripling since 2000. This trend is especially true for geography, a discipline that could be said to be having somewhat of a ‘kid in a candy store’ boom of research potential with exponential increases in recorded image volume every year. For even just two satellites- ESA’s Sentinel 1 and 2- downlinked data volume is over 3 petabytes annually. If printed, this data would be enough to fill 1.5 trillion sheets of 8.5 x 11 paper. As the sheer amount of data and its accessibility increases year-to-year, so too do the opportunities for scientific breakthroughs that warrant dissemination through publication.]]></summary></entry></feed>