Abstract

The computational power of graphics processing units (GPUs) has become prevalent in many fields of computer engineering. Their capacity for massively parallel workloads and large data sets makes GPUs an essential asset in tackling today's computationally intensive problems. One field that has benefited greatly from the introduction of GPUs is machine learning. Many applications of machine learning use algorithms that show a significant speedup on a GPU compared to other processors due to the massively parallel nature of the problem set. The existing cache architecture, however, may not be ideal for these applications. The goal of this thesis is to determine whether the GPU cache architecture can be redesigned to better fit the needs of this increasingly popular field of computer engineering.

This work uses a cycle-accurate GPU simulator, Multi2Sim, to analyze NVIDIA GPU architectures. The architectures are based on the Kepler series, but the flexibility of the simulator allows for emulation of newer features. Changes were made to the source code to expand the metrics recorded and so further the understanding of the cache architecture. Two suites of benchmarks were used: one for general-purpose algorithms and another for machine learning. Running the benchmarks with various cache configurations gave insight into the effect the cache architecture had on each of them. Analysis of the results shows that the cache architecture, while beneficial to the general-purpose algorithms, does not need to be as complex for machine learning algorithms. A large contributor to the complexity is the cache coherence protocol used by GPUs. Due to the high spatial locality associated with machine learning problems, the overhead of implementing the coherence protocol brings little benefit, and simplifying the architecture can lead to smaller, cheaper, and more efficient designs.

Library of Congress Subject Headings

Graphics processing units; Cache memory; Computer architecture; Machine learning

Publication Date

7-2019

Document Type

Thesis

Student Type

Graduate

Degree Name

Computer Engineering (MS)

Department, Program, or Center

Computer Engineering (KGCOE)

Advisor

Sonia Lopez Alarcon

Advisor/Committee Member

Marcin Lukowiak

Advisor/Committee Member

Roy Melton

Recommended Citation

Kotas, Gerald, "Exploration of GPU Cache Architectures Targeting Machine Learning Applications" (2019). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/10159

Campus

RIT – Main Campus

Plan Codes

CMPE-MS
