3.) Memory Management and Caching
Memory on commodity hardware consists of main memory and one or more levels of cache. The caches exist to reduce the delay in moving data from memory to the CPU. As data moves from main memory toward the CPU, access latency (the time between a request for data and its arrival) falls, but capacity also falls, because each cache closer to the CPU is smaller than the one before it. Programmers are urged to maximize the reuse of data in the Level 1 cache, which is closest to the CPU and is where data can be processed most efficiently. The second-level cache, the third-level cache found on most systems, and main memory each hold progressively more data, but with progressively longer delays before that data can be processed. Managing well how data streams through the caches into and out of the CPU can substantially improve code performance. For this reason, programmers who want to optimize their code need to invest serious time and effort in managing the flow of data through the caches to the CPUs and back.
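To make the idea of keeping data in the nearest cache more concrete, here is a minimal sketch comparing two ways of walking over the same matrix. The function names (make_matrix, sum_row_major, sum_column_major) are made up for this illustration. The matrix is stored row by row in one contiguous buffer, so the first traversal touches neighboring memory locations while the second jumps ahead by a full row on every access. In Python the interpreter overhead tends to hide most of the timing difference; the access pattern is the point, and it carries over to compiled languages where the gap is usually much more visible.

```python
from array import array


def make_matrix(rows, cols):
    """Store a rows x cols matrix row by row in one contiguous buffer of doubles."""
    return array("d", (float(i) for i in range(rows * cols)))


def sum_row_major(data, rows, cols):
    """Visit elements in storage order: consecutive accesses are adjacent in memory."""
    total = 0.0
    for r in range(rows):
        base = r * cols
        for c in range(cols):
            total += data[base + c]
    return total


def sum_column_major(data, rows, cols):
    """Visit elements column by column: each access jumps `cols` elements ahead."""
    total = 0.0
    for c in range(cols):
        for r in range(rows):
            total += data[r * cols + c]
    return total
```

Both functions return the same sum; only the order of memory accesses differs. The row-major version reuses each cache line it loads before moving on, which is the kind of Level 1 reuse described above.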
2.) Threads and Message Passing
There are two ways to achieve parallelism in computing. One is to use multiple CPUs on a node to execute parts of a single process. For instance, you can divide a loop into four smaller loops, each a fraction of the original size, and run them on separate CPUs at the same time. This is known as threading, and each CPU runs its own thread. The other option is to divide a computation into several processes. These processes may all depend on the same data, and that dependence requires them to pass messages to one another over a reliable communication medium. When processes on different nodes exchange data with one another, it is known as message passing.
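The loop-splitting idea can be sketched in a few lines. This is only an illustration under some assumptions: the function names are invented here, the loop body (a sum of squares) is an arbitrary stand-in, and Python's standard thread pool is used in place of a real threading interface such as OpenMP. Note also that in standard CPython builds the global interpreter lock keeps these threads from executing Python bytecode at the same time, so the sketch shows the structure of the technique rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(values, start, stop):
    """Loop body for one thread: handle indices start..stop-1 only."""
    total = 0
    for i in range(start, stop):
        total += values[i] * values[i]
    return total


def threaded_sum_of_squares(values, n_threads=4):
    """Split one loop over `values` into n_threads smaller loops, one per thread."""
    n = len(values)
    # Chunk boundaries: thread t handles indices bounds[t] .. bounds[t + 1] - 1.
    bounds = [n * t // n_threads for t in range(n_threads + 1)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [
            pool.submit(sum_of_squares, values, bounds[t], bounds[t + 1])
            for t in range(n_threads)
        ]
        # Combine the partial results once every thread has finished.
        return sum(f.result() for f in futures)
```

All four threads read the same list directly because they share the memory of one process, which is what separates threading from message passing.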
1.) Parallel Programming Paradigms
Parallel programming paradigms involve two requirements. One is the efficient use of multiple CPUs within a single process. The other is communication between nodes, so that related parallel processes running on different nodes can exchange the data they depend on. A parallel program normally consists of a set of processes that exchange data with each other, either through shared memory or over a network interconnect fabric. Parallel programs in which several CPUs communicate through shared memory most often use the OpenMP interface. The independent streams of execution running on multiple CPUs within one node are known as threads, as mentioned before. Carefully coded hybrid programs can achieve very high performance and efficiency. These hybrid programs use both OpenMP, for threading within a node, and MPI (Message Passing Interface), for communication between nodes.
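The hybrid structure can be imitated on a single machine with Python's standard library. This is a rough sketch and not OpenMP or MPI: separate worker processes stand in for MPI ranks, operating system pipes stand in for the network interconnect, and a small thread pool inside each worker stands in for OpenMP threads. The names hybrid_sum and WORKER_SOURCE, and the JSON message format, are assumptions made up for this example.

```python
import json
import subprocess
import sys

# Program run by each worker process. It receives one JSON message on stdin,
# sums its chunk using two threads, and sends one JSON message back on stdout.
WORKER_SOURCE = """
import json, sys
from concurrent.futures import ThreadPoolExecutor
chunk = json.loads(sys.stdin.read())
half = len(chunk) // 2
with ThreadPoolExecutor(max_workers=2) as pool:
    partial = sum(pool.map(sum, [chunk[:half], chunk[half:]]))
json.dump({"partial": partial}, sys.stdout)
"""


def hybrid_sum(values, n_procs=2):
    """Sum `values` by sending one chunk to each of up to n_procs worker processes."""
    size = max(1, -(-len(values) // n_procs))  # ceiling division, at least 1
    chunks = [values[i:i + size] for i in range(0, len(values), size)]

    procs = []
    for chunk in chunks:
        proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        proc.stdin.write(json.dumps(chunk))  # send the chunk as a message
        proc.stdin.close()                   # closing the pipe ends the message
        procs.append(proc)

    total = 0
    for proc in procs:
        reply = json.loads(proc.stdout.read())  # receive the worker's reply
        proc.stdout.close()
        proc.wait()
        total += reply["partial"]
    return total
```

Unlike threads, the workers share no memory with the parent: every value they need arrives in a message and every result leaves in one. On a real cluster the same division of labor would typically be written with MPI calls between nodes and OpenMP directives inside each node.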