<!DOCTYPE html>
<html lang="en-US">
<head>
<meta charset="UTF-8">
<!-- Begin Jekyll SEO tag v2.7.1 -->
<title>The Auto Arborist Dataset</title>
<meta name="generator" content="Jekyll v3.9.0" />
<meta property="og:title" content="The Auto Arborist Dataset" />
<meta property="og:locale" content="en_US" />
<meta name="description" content="Sara Beery, Guanhang Wu, Trevor Edwards, Filip Pavetic, Bo Majewski, Shreyasee Mukherjee, Stanley Chan, John Morgan, Vivek Rathod, Jonathan Huang. Caltech, Google" />
<meta property="og:description" content="Sara Beery, Guanhang Wu, Trevor Edwards, Filip Pavetic, Bo Majewski, Shreyasee Mukherjee, Stanley Chan, John Morgan, Vivek Rathod, Jonathan Huang. Caltech, Google" />
<link rel="canonical" href="https://google.github.io/auto-arborist/" />
<meta property="og:url" content="https://google.github.io/auto-arborist/" />
<meta property="og:site_name" content="The Auto Arborist Dataset" />
<meta name="twitter:card" content="summary" />
<meta property="twitter:title" content="The Auto Arborist Dataset" />
<script type="application/ld+json">
{"description":"Sara Beery, Guanhang Wu, Trevor Edwards, Filip Pavetic, Bo Majewski, Shreyasee Mukherjee, Stanley Chan, John Morgan, Vivek Rathod, Jonathan Huang. Caltech, Google","@type":"WebSite","headline":"The Auto Arborist Dataset","url":"https://google.github.io/auto-arborist/","name":"The Auto Arborist Dataset","@context":"https://schema.org"}</script>
<!-- End Jekyll SEO tag -->
<link rel="preconnect" href="https://fonts.gstatic.com">
<link rel="preload" href="https://fonts.googleapis.com/css?family=Open+Sans:400,700&display=swap" as="style" type="text/css" crossorigin>
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="theme-color" content="#157878">
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
<link rel="stylesheet" href="assets/css/style.css">
</head>
<body>
<header class="page-header" role="banner">
<h1 class="project-name">The Auto Arborist Dataset</h1>
<h2 class="project-tagline">Sara Beery, Guanhang Wu, Trevor Edwards, Filip Pavetic, Bo Majewski, Shreyasee Mukherjee, Stanley Chan, John Morgan, Vivek Rathod, Jonathan Huang<br>Caltech, Google</h2>
<a href="https://openaccess.thecvf.com/content/CVPR2022/html/Beery_The_Auto_Arborist_Dataset_A_Large-Scale_Benchmark_for_Multiview_Urban_CVPR_2022_paper.html" style="font-size:1.3rem" class="btn">Paper</a>
</header>
<main id="content" class="main-content" role="main">
<c><i><b>The Auto Arborist dataset is a multiview fine-grained visual categorization dataset containing over 2 million trees from over 300 genus-level categories in 23 cities across the US and Canada. It was built to foster the development of robust methods for large-scale urban forest monitoring. The dataset was initially released as part of a <a href="https://openaccess.thecvf.com/content/CVPR2022/html/Beery_The_Auto_Arborist_Dataset_A_Large-Scale_Benchmark_for_Multiview_Urban_CVPR_2022_paper.html">CVPR 2022 publication</a>. Data use and access information is available <a href="#data_use">here</a>.</b></i></c>
<p style="margin-top:1cm;"><h1>Dataset Overview</h1></p>
<p><img src="assets/Auto_Arborist_Overview.png" alt="Overview of Auto Arborist Dataset" class="center-image"></p>
<c><i>The 23 cities in our dataset are spread across the US and Canada and are grouped into West, Central, and East regions, so that we can study how well models generalize spatially and hierarchically.</i></c>
<p><img src="assets/tree_gif.gif" alt="Trees in Cities in Auto Arborist" class="center-image-wide" /></p>
<c><i>Tree records per city. Our dataset contains more than 300 tree genus classes. Note the shifts in the distribution of common genera across cities (visualized by color).</i></c>
<p>
We propose <b>urban forest monitoring</b> as an ideal testbed for several computer vision challenges (domain generalization, fine-grained categorization, long-tail learning, multiview vision) that also helps fill a crucial environmental and societal need. Urban forests provide significant benefits to urban societies, including cleaner air and water, carbon sequestration, and energy savings. However, planning and maintaining these forests is expensive. One particularly costly aspect of urban forest management is monitoring the existing trees in a city, for example, tracking tree locations, species, and health. Monitoring efforts are currently based on tree censuses conducted by human experts, which cost cities millions of dollars per census and are therefore collected infrequently.
</p>
<p><img src="assets/table.png" alt="Tree records" class="center-image-wide" /></p>
<c><i>The number of tree records in each city, with the held-out cities in bold.</i></c>
<p>
Previous investigations into automating urban forest monitoring focused on small datasets from single cities, covering only common categories. To address these shortcomings, we introduce a new large-scale dataset that joins public tree censuses from 23 cities with a large collection of street-level and aerial imagery. Our Auto Arborist dataset contains over <b>2.5M trees</b> and over <b>300 genera</b> and is more than two orders of magnitude larger than the closest dataset in the literature. In our paper, we introduce baseline results on our dataset across modalities, as well as metrics for detailed analysis of generalization with respect to geographic distribution shifts, which is vital for such a system to be deployed at scale.
</p>
<p style="margin-top:1cm;"><h1>Dataset Challenges</h1></p>
<p>
<b>Generalization to novel domains is a fundamental challenge for computer vision.</b> Near-perfect accuracy on benchmarks is common, but the models that achieve it do not work as expected when deployed outside of the training distribution. To build computer vision systems that truly solve real-world problems at global scale, we need benchmarks that fully capture real-world complexity, including geographic domain shift, long-tailed distributions, and data noise.
</p>
<p style="text-align: center"><img src="assets/long_tailed.png" alt="fine-grained" class="center" width="600"></p>
<c><i>The data has a long-tailed distribution across categories, meaning that the majority of the examples in the dataset come from just a few frequent categories, and many of the categories have far fewer examples. We characterize each genus as frequent, common, or rare based on the number of training examples we have for that genus. Note that our test data is split spatially from our training data within each city, so not all rare genera appear in the test set.</i></c>
<p style="text-align: center"><img src="assets/taxonomy.png" alt="taxonomy" class="center" ></p>
<p style="text-align: center"><img src="assets/taxonomy_crop.png" alt="taxonomy zoomed in" class="center" width=300 ></p>
<c><i>This dendrogram shows the taxonomic structure of the genera in Auto Arborist. The dataset is taxonomically diverse, with more than 300 different genera represented.</i></c>
<p><img src="assets/blur.png" alt="blurred imagery" class="center"></p>
<c><i>Examples of street-level data after blurring for privacy.</i></c>
<p style="margin-top:1cm;"><a id="data_use"><h1>Data Use and Access</h1></a></p>
<p>We may post updates about the project and dataset on our Google Group:
<a href="https://groups.google.com/g/auto-arborist">https://groups.google.com/g/auto-arborist</a>.</p>
<p>Due to maintenance constraints, the dataset was taken offline on May 27, 2025. You can reach us at auto-arborist-external [at] google [dot] com for information about the dataset, but we can no longer provide the dataset itself.</p>
<footer class="site-footer">
<span class="site-footer-credits">This page was generated by <a href="https://pages.github.com">GitHub Pages</a>.</span>
</footer>
</main>
</body>
</html>