True parallelism can be achieved via a wide variety of avenues. Instruction-level parallelism offered considerable value when processor architectures grew to execute machine instructions in parallel whenever possible, reordering instructions or filling delay slots as the opportunity arose. All of this happened without the application programmer's involvement or knowledge, so it promised the benefits of parallelism for most programs out of the box, without modification. The onus fell instead on processor architects and compiler developers to exploit such parallelism, which sometimes turned out to be a tall order.

Task parallelism, in contrast, typically demands much more from application programmers, since exposing and exploiting parallelism becomes very domain-specific. Its effectiveness hinges on balancing the load across execution threads. Runtime support can help fill in the gaps by applying advanced scheduling techniques such as work stealing, but what constitutes a thread (i.e., how to partition the program) and how to mediate communication and sharing between threads is ultimately up to the programmer.

Another way to achieve parallelism, at a finer grain, is to implement and expose a parallel data type in a language, runtime, or distributed computing framework. Apache Spark does this with its RDD (Resilient Distributed Dataset) abstraction, which can be populated from a Scala Seq, a Java Collection, or a Python iterable. Hadoop MapReduce similarly hands reducers their values through a Java Iterable. In modern distributed computing environments the pipeline itself may be an arbitrary DAG (more general than map-reduce), but the parallelism still stems from the data representation, hence data parallelism. Of all the forms of parallelism, it is data parallelism that has ultimately demonstrated straightforward and efficient scaling to truly large problems.
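To make the data-parallel style concrete, here is a minimal sketch using Spark's RDD API from Scala. It assumes a local SparkContext (the `local[*]` master URL and the app name are placeholders for illustration); the point is only that the collection is partitioned across workers, and the map and reduce operations run on those partitions in parallel.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DataParallelSketch {
  def main(args: Array[String]): Unit = {
    // Local SparkContext for illustration; a real deployment would point
    // the master at a cluster instead of local[*].
    val conf = new SparkConf().setAppName("data-parallel-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // parallelize copies a local Scala Seq into an RDD, partitioning it
    // across the available workers: the data representation carries the
    // parallelism, so the map and reduce below run per-partition in parallel.
    val numbers = sc.parallelize(1 to 1000000)
    val sumOfSquares = numbers.map(n => n.toLong * n).reduce(_ + _)

    println(s"Sum of squares: $sumOfSquares")
    sc.stop()
  }
}
```

Note that nothing in this sketch names threads or mediates sharing between them; the pipeline could just as well be a larger DAG of transformations, and the framework would still scale it out because the parallelism lives in the data type itself.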