Labels: feature (New feature or request), lauzhack2022 (good issue for LauzHack 2022)
PROBLEM:
Distributed learning has a higher carbon cost than centralized learning (https://arxiv.org/abs/2102.07627)

The problem is worsened in non-IID settings

POSSIBLE SOLUTIONS:
This can be mitigated by several approaches:
- model compression
- communication compression (e.g. PowerSGD)
- client selection
- especially for non-IID settings: we can select clients to preferentially learn from, based on a trade-off between the value of the client's data and the carbon cost of using it --> this can be seen as a form of model personalization
- others I won't list yet...
PROPOSED APPROACH:
- The first step is computing the carbon footprint
- The carbon footprint of distributed learning = cost of compute + cost of communication. Both depend on the carbon intensity (https://app.electricitymaps.com/map) of where the computation takes place (i.e. the energy mix of the country or data centre, for example)
- Our tool, CUMULATOR (https://pypi.org/project/cumulator/) computes this (albeit imperfectly, as it does not take some important real-world hardware issues into consideration, but let's ignore that for now)
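A minimal sketch of the footprint formula above (the function name and the numbers are illustrative; this is not CUMULATOR's API):

```python
# Sketch: footprint = (compute energy + communication energy) * local carbon intensity.
def carbon_footprint(compute_kwh: float, comm_kwh: float,
                     intensity_gco2_per_kwh: float) -> float:
    """Return estimated emissions in grams of CO2-equivalent."""
    return (compute_kwh + comm_kwh) * intensity_gco2_per_kwh

# e.g. 0.5 kWh of compute + 0.1 kWh of communication at 300 gCO2eq/kWh
print(carbon_footprint(0.5, 0.1, 300))  # 180.0 gCO2eq
```

The carbon intensity factor is what makes location matter: the same training run emits very different amounts depending on the local energy mix.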
- Next is adapting and integrating CUMULATOR into DISCO
- OPTION 1: Sending CUMULATOR to each user --> monitoring local compute and comms --> sending results to be aggregated --> report results to all users
- This is maybe unfeasible, invasive, and creates more communication overhead than it is worth... to be explored
- OPTION 2: Asking each user to collect data on the determinants of carbon footprint (GPU/CPU brand, geographical location, epochs of local learning, and number of communication rounds) --> then communicating either those metrics directly or some privacy-preserving composite of them --> predicting the carbon footprint centrally from these metrics, compared against the model run on a benchmark GPU centrally
- others... to discuss
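A rough sketch of what OPTION 2's client report and central prediction could look like (all names and the simple energy model are assumptions, not DISCO's or CUMULATOR's API):

```python
from dataclasses import dataclass

@dataclass
class ClientReport:
    """Determinants of carbon footprint a client would report (hypothetical schema)."""
    tdp_watts: float      # rated power draw of the client's CPU/GPU
    hours_trained: float  # local training time (epochs * time per epoch)
    comm_rounds: int      # number of communication rounds
    kwh_per_round: float  # estimated energy per communication round
    intensity: float      # gCO2eq/kWh at the client's geographical location

def predict_footprint(r: ClientReport) -> float:
    """Central prediction of a client's emissions (gCO2eq) from reported metrics."""
    compute_kwh = r.tdp_watts * r.hours_trained / 1000
    comm_kwh = r.comm_rounds * r.kwh_per_round
    return (compute_kwh + comm_kwh) * r.intensity

report = ClientReport(tdp_watts=250, hours_trained=2,
                      comm_rounds=10, kwh_per_round=0.01, intensity=300)
print(round(predict_footprint(report), 1))  # 180.0
```

A privacy-preserving variant would have clients send only the composite (e.g. the final gCO2eq estimate) rather than the raw determinants.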
- Finally, displaying the results
- Displaying tangible results to users to communicate the cost of training the model.
- Perhaps add some tips about how to reduce this cost
- Definitely show the "value" gained for each "unit" of carbon
Example:
Each 1% of accuracy/F1, etc. costs 10 Wh (or 1 week of an average tree's carbon-recycling capacity)
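The example metric above could be produced by something like the following (hypothetical helper, illustrative numbers):

```python
def wh_per_accuracy_point(total_wh: float, accuracy_gain_pct: float) -> float:
    """Energy cost (Wh) per percentage point of accuracy/F1 gained."""
    return total_wh / accuracy_gain_pct

# e.g. 850 Wh spent training a model that reaches 85% accuracy
print(wh_per_accuracy_point(850, 85))  # 10.0 Wh per accuracy point
```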
- Once the above are done...we can explore carbon-optimized learning
- Monitoring the carbon impact of optimization techniques like compression
- Making the trade-offs necessary to perform "client selection" (deciding whom to communicate with to best balance accuracy against carbon footprint)
- etc!
Lauzhack candidate task