  1. Distributed Inference with Deep Learning Models across …

    Recent years have witnessed increasing research attention in deploying deep learning models on edge devices for inference. Due to limited capabilities and power c…

  2. Performance Prediction for Deep Learning Models With Pipeline …

    Jul 11, 2023 · In this article, we propose TPPNet, a transformer-based model for predicting the inference performance of various DL models with the pipeline inference strategy.

  3. PipeEdge: Pipeline Parallelism for Large-Scale Model Inference on ...

    Deep neural networks with large model sizes achieve state-of-the-art results for tasks in computer vision and natural language processing. However, such models …

  4. Automatic Pipeline Parallelism: A Parallel Inference Framework for …

    In this paper, we propose Automatic Pipeline Parallelism (AP2), a parallel inference framework for deep learning applications in 6G mobile communication systems, to improve the model …

  5. NAIR: An Efficient Distributed Deep Learning Architecture for …

    Abstract: The distributed deep learning architecture can support the front-deployment of deep learning systems in resource-constrained Internet of Things devices and is attracting …

  6. Adaptive and Resilient Model-Distributed Inference in Edge …

    In this paper, we analyze the potential of model-distributed inference in edge computing systems. Then, we develop an Adaptive and Resilient Model-Distributed Inference (AR-MDI) algorithm …

  7. Accelerating Deep Learning Inference via Model Parallelism and …

    In this paper, we take advantage of intrinsic DNN computation characteristics and propose a novel Fused-Layer-based (FL-based) DNN model parallelism method to accelerate inference.

  8. DeepBoot: Dynamic Scheduling System for Training and Inference …

    Jul 11, 2023 · Our implementation on the testbed and large-scale simulation on Microsoft deep learning workloads show that DeepBoot can achieve 32% and 38% average JCT reduction …

  9. Extendable Multi-Device Collaborative Pipeline Parallel Inference in ...

    Oct 18, 2024 · Therefore, we propose a multi-device collaborative pipeline parallel inference method to diminish model inference time in the edge-cloud scenario. This method consists of …

  10. End-Edge-Cloud Collaborative Computing for Deep Learning: A ...

    Therefore, this paper: 1) analyzes the collaborative elements within the end-edge-cloud computing system for deep learning, and proposes collaborative training, inference, and …
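Several of these results (3, 4, and 9 in particular) describe pipeline-parallel inference: a model's layers are partitioned into stages placed on different devices, and micro-batches flow through the stages so that stage i works on micro-batch k+1 while stage i+1 processes micro-batch k. A minimal sketch of that schedule in plain Python follows; the two stage functions are arbitrary placeholders for blocks of layers, not any paper's actual partitioning:

```python
from queue import Queue
from threading import Thread

# Hypothetical two-stage split of a model: each "stage" is a plain
# function standing in for the layers assigned to one device.
def stage1(x):
    return x * 2          # placeholder for the first block of layers

def stage2(x):
    return x + 1          # placeholder for the remaining layers

def pipeline_infer(inputs, stages):
    """Feed micro-batches through the stages in pipeline fashion:
    each stage runs in its own worker, so consecutive micro-batches
    occupy different stages concurrently."""
    queues = [Queue() for _ in range(len(stages) + 1)]
    for x in inputs:
        queues[0].put(x)
    queues[0].put(None)   # sentinel marks end of the micro-batch stream

    def worker(fn, q_in, q_out):
        # Pull micro-batches, apply this stage, forward to the next stage.
        while (item := q_in.get()) is not None:
            q_out.put(fn(item))
        q_out.put(None)   # propagate the sentinel downstream

    threads = [Thread(target=worker, args=(fn, queues[i], queues[i + 1]))
               for i, fn in enumerate(stages)]
    for t in threads:
        t.start()

    outputs = []
    while (y := queues[-1].get()) is not None:
        outputs.append(y)
    for t in threads:
        t.join()
    return outputs

print(pipeline_infer([1, 2, 3], [stage1, stage2]))  # → [3, 5, 7]
```

With real models the stages would be device-resident layer groups and the queues inter-device transfers; the FIFO queues here preserve micro-batch order, which is why the outputs come back in input order.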
