An Energy-Efficient Parallelism Scheme for Deep Neural Network Training and Inferencing on Heterogeneous Cloud Resources

Hong Ju Jeong, Hacksung Boo, Jiseung Bae, Mincheol Jeon, Eui Nam Huh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The emergence of Large Language Models (LLMs) and generative AI has led to an explosive increase in computational demands across cloud computing data centers. The growing number of parameters in deep learning models causes a significant power consumption problem, creating a need for cost-effective and eco-friendly data centers. Furthermore, with the advent of multi-cloud environments, deep learning computations, for both training and inference, no longer occur on a single hardware unit but are distributed across clusters of heterogeneous hardware nodes. In this paper, we present solutions to these challenges from a parallelism perspective. Considering the characteristics of the models, we implement data parallelism and model parallelism, partitioning models and data across heterogeneous hardware nodes for power-efficient training and inference. To quantify the impact, we measured the power consumption of CPUs, GPUs, and RAM during the experiments, providing insights into the energy efficiency of the proposed partitioning strategies. We also conducted a carbon footprint analysis, converting the measured power consumption into equivalent carbon emissions. The study highlights the need for research on partitioning to achieve energy-efficient training and inference.

Original language: English
Title of host publication: ICIIT 2024 - Proceedings of the 2024 9th International Conference on Intelligent Information Technology
Publisher: Association for Computing Machinery
Pages: 493-498
Number of pages: 6
ISBN (Electronic): 9798400716713
Publication status: Published - 23 Feb 2024
Event: 2024 9th International Conference on Intelligent Information Technology, ICIIT 2024 - Ho Chi Minh, Viet Nam
Duration: 23 Feb 2024 to 25 Feb 2024

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2024 9th International Conference on Intelligent Information Technology, ICIIT 2024
Country/Territory: Viet Nam
City: Ho Chi Minh
Period: 23/02/24 to 25/02/24

Bibliographical note

Publisher Copyright:
© 2024 Copyright held by the owner/author(s).

Keywords

  • Carbon footprint
  • Deep learning
  • Heterogeneous
  • Parallelism
  • Power consumption
