
Carbon-optimized learning #501

@Annie-LiGHT

Description


PROBLEM:

Distributed learning has a higher carbon cost than centralized learning (https://arxiv.org/abs/2102.07627).

The problem is worsened in non-IID settings.


POSSIBLE SOLUTIONS:

This can be mitigated by several approaches:

  1. model compression
  2. communication compression (e.g. PowerSGD)
  3. client selection
    - especially for non-IID settings: we can preferentially select clients to learn from based on a trade-off between the value of a client's data and the carbon cost of using it --> this can be seen as a form of model personalization
  4. others I won't list yet...
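The client-selection trade-off in (3) could be sketched as follows. Everything here is an illustrative assumption (scoring rule, field names, numbers), not part of DISCO or CUMULATOR:

```python
# Minimal sketch of carbon-aware client selection: score each client by the
# estimated value of its data per gram of CO2eq emitted by using it, keep top-k.
# All names and numbers are hypothetical placeholders.

def selection_score(data_value: float, carbon_g: float) -> float:
    """Value gained per gram of CO2eq; higher is better."""
    return data_value / (carbon_g + 1e-9)

def select_clients(clients: list[dict], k: int) -> list[str]:
    """Pick the k clients with the best value-vs-carbon trade-off."""
    ranked = sorted(clients,
                    key=lambda c: selection_score(c["value"], c["carbon_g"]),
                    reverse=True)
    return [c["id"] for c in ranked[:k]]

clients = [
    {"id": "a", "value": 0.9, "carbon_g": 120.0},  # valuable data, carbon-heavy grid
    {"id": "b", "value": 0.7, "carbon_g": 10.0},   # decent data, low-carbon grid
    {"id": "c", "value": 0.2, "carbon_g": 5.0},    # low value despite low cost
]
print(select_clients(clients, k=2))  # -> ['b', 'c']
```

Note the ratio form: it directly encodes "value gained per unit of carbon", so a client with excellent data on a very dirty grid (client "a") can still lose to cheaper clients.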

PROPOSED APPROACH:

  1. The first step is computing the carbon footprint
    The carbon footprint of distributed learning = cost of compute + cost of communication

    • Both depend on the carbon intensity (https://app.electricitymaps.com/map) of where the computation takes place (i.e. the energy mix of the country or data centre, for example)
    • Our tool, CUMULATOR (https://pypi.org/project/cumulator/), computes this (albeit imperfectly, as it does not take some important real-world hardware issues into consideration, but let's ignore that for now)
  2. Next is adapting and integrating CUMULATOR into DISCO

    • OPTION 1: sending CUMULATOR to each user --> monitoring local compute and comms --> sending results to be aggregated --> reporting results to all users
      - This may be unfeasible and invasive, and may create more communication overhead than it is worth... to be explored
    • OPTION 2: asking each user to collect data on the determinants of their carbon footprint (GPU/CPU brand, geographical location, epochs of local learning, and number of communication rounds) --> then communicating either those metrics directly or some privacy-preserving composite of them --> predicting the carbon footprint centrally from these metrics, compared to the same model run centrally on a benchmark GPU
    • others... to discuss
  3. Finally, displaying the results

  • Displaying tangible results to users to communicate the cost of training the model
  • Perhaps adding some tips on how to reduce this cost
  • Definitely showing the "value" gained for each "unit" of carbon

Example:

Each 1% of accuracy/F1 etc. costs 10 Wh (or 1 week of an average tree's carbon recycling capacity)
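This kind of report could be computed with something like the sketch below. The formulas and numbers are simplifying assumptions for illustration, not CUMULATOR's actual model:

```python
# Rough sketch: footprint = (compute energy + communication energy) * carbon
# intensity of the local grid, then report "value per unit of carbon".
# All figures below are assumed, not measured.

def footprint_g(compute_kwh: float, comm_kwh: float, intensity_g_per_kwh: float) -> float:
    """Grams of CO2eq for one training run on a given grid."""
    return (compute_kwh + comm_kwh) * intensity_g_per_kwh

def wh_per_accuracy_point(total_kwh: float, accuracy_gain_pct: float) -> float:
    """Wh spent per 1% of accuracy gained -- the 'value per unit of carbon' metric."""
    return total_kwh * 1000 / accuracy_gain_pct

total_kwh = 0.5   # assumed total energy of the run (compute + communication)
gain_pct = 50.0   # assumed accuracy improvement in percentage points
print(f"{wh_per_accuracy_point(total_kwh, gain_pct):.0f} Wh per 1% of accuracy")
# -> 10 Wh per 1% of accuracy, matching the example above
print(f"{footprint_g(0.45, 0.05, 300.0):.0f} gCO2eq on a 300 g/kWh grid")
```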

  4. Once the above are done, we can explore carbon-optimized learning
  • Monitoring the carbon impact of optimization techniques like compression
  • Making the trade-offs necessary to perform "client selection" (deciding whom to communicate with to maximize the trade-off between accuracy and carbon footprint)
  • etc!
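OPTION 2 from the proposed approach could look roughly like the sketch below: each client reports only coarse determinants of its footprint, and the server predicts the footprint centrally. All names, lookup tables, and figures are hypothetical placeholders; real intensities would come from a source like https://app.electricitymaps.com/map:

```python
# Hypothetical sketch of OPTION 2 (not DISCO or CUMULATOR APIs): clients
# report hardware class, location, local epochs, and communication rounds;
# the server predicts the footprint from assumed lookup tables.

from dataclasses import dataclass

# Assumed per-epoch energy draw (kWh) per hardware class, and assumed grid
# carbon intensities (gCO2eq/kWh) -- placeholder values only.
KWH_PER_EPOCH = {"cpu": 0.01, "gpu": 0.05}
INTENSITY = {"CH": 50.0, "PL": 700.0}
COMM_KWH_PER_ROUND = 0.001  # assumed energy per communication round

@dataclass
class ClientReport:
    hardware: str   # "cpu" or "gpu"
    country: str    # ISO code for the carbon-intensity lookup
    epochs: int     # local epochs trained
    rounds: int     # communication rounds participated in

def predict_footprint_g(r: ClientReport) -> float:
    """Predicted gCO2eq for one client, from its reported metrics."""
    energy_kwh = r.epochs * KWH_PER_EPOCH[r.hardware] + r.rounds * COMM_KWH_PER_ROUND
    return energy_kwh * INTENSITY[r.country]

total = sum(predict_footprint_g(r) for r in [
    ClientReport("gpu", "CH", epochs=10, rounds=20),
    ClientReport("cpu", "PL", epochs=10, rounds=20),
])
print(round(total, 1))  # -> 110.0
```

Because only these four coarse metrics leave the device, this variant avoids shipping a monitoring tool to every user; the privacy-preserving composite mentioned above could further aggregate them before they reach the server.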

Lauzhack candidate task


Labels: feature (New feature or request), lauzhack2022 (good issue for lauzhack 2022)
