Professional Parallel Programming with C#: Master Parallel Extensions with .NET 4
Gastón C. Hillar
PROFESSIONAL
PARALLEL PROGRAMMING WITH C#
FOREWORD
INTRODUCTION
CHAPTER 1 Task-Based Programming
CHAPTER 2 Imperative Data Parallelism
CHAPTER 3 Imperative Task Parallelism
CHAPTER 4 Concurrent Collections
CHAPTER 5 Coordination Data Structures
CHAPTER 6 PLINQ: Declarative Data Parallelism
CHAPTER 7 Visual Studio 2010 Task Debugging Capabilities
CHAPTER 8 Thread Pools
CHAPTER 9 Asynchronous Programming Model
CHAPTER 10 Parallel Testing and Tuning
CHAPTER 11 Vectorization, SIMD Instructions, and Additional Parallel Libraries
APPENDIX A .NET 4 Parallelism Class Diagrams
APPENDIX B Concurrent UML Models
APPENDIX C Parallel Extensions Extras
INDEX
PROFESSIONAL
Parallel Programming with C#
MASTER PARALLEL EXTENSIONS WITH .NET 4
Gastón C. Hillar
Professional Parallel Programming with C#
Published by
Wiley Publishing, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com
Copyright © 2011 by Wiley Publishing, Inc., Indianapolis, Indiana
Published by Wiley Publishing, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN: 978-0-470-49599-5
ISBN: 978-1-118-02812-4 (ebk)
ISBN: 978-1-118-02977-0 (ebk)
ISBN: 978-1-118-02978-7 (ebk)
Manufactured in the United States of America
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means,
electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108
of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization
through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers,
MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the
Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008,
or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with
respect to the accuracy or completeness of the contents of this work, and specifically disclaim all warranties, including,
without limitation, warranties of fitness for a particular purpose. No warranty may be created or extended by sales or
promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is
sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services.
If professional assistance is required, the services of a competent professional person should be sought. Neither the
publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred
to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher
endorses the information the organization or website may provide or recommendations it may make. Further, readers
should be aware that Internet websites listed in this work may have changed or disappeared between when this work
was written and when it is read.
For general information on our other products and services, please contact our Customer Care Department within the
United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available
in electronic books.
Library of Congress Control Number: 2010930961
Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, and Wrox Programmer to Programmer are trademarks or registered
trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be
used without written permission. All other trademarks are the property of their respective owners. Wiley Publishing, Inc.
is not associated with any product or vendor mentioned in this book.
To my wonderful wife, Vanesa, who has somehow
learned to put up with marathon writing sessions. And
to my loving son, Kevin, who always managed to put
a smile on my face after a long day.
CREDITS
ACQUISITIONS EDITOR
Paul Reese
PROJECT EDITORS
Ed Connor
Ginny Munroe
TECHNICAL EDITOR
Doug Parsons
PRODUCTION EDITOR
Kathleen Wisor
COPY EDITOR
Kathryn Duggan
EDITORIAL DIRECTOR
Robyn B. Siesky
EDITORIAL MANAGER
Mary Beth Wakefield
FREELANCER EDITORIAL MANAGER
Rosemarie Graham
ASSOCIATE DIRECTOR OF MARKETING
David Mayhew
PRODUCTION MANAGER
Tim Tate
VICE PRESIDENT AND EXECUTIVE GROUP PUBLISHER
Richard Swadley
VICE PRESIDENT AND EXECUTIVE PUBLISHER
Barry Pruett
ASSOCIATE PUBLISHER
Jim Minatel
PROJECT COORDINATOR, COVER
Katie Crocker
PROOFREADER
Paul Sagan, Word One New York
INDEXER
Robert Swanson
COVER IMAGE
© David Marchal/istockphoto.com
COVER DESIGNER
Michael Trent
ABOUT THE AUTHOR
GASTÓN C. HILLAR has been working with computers since he was eight. He
began programming with the legendary Texas Instruments TI-99/4A and Commodore
64 home computers in the early ’80s. He received a bachelor’s degree from
UADE University, where he graduated with honors, and a Master of Business
Administration from UCEMA University, where he graduated with an outstanding
thesis.
Gastón has been researching parallel programming, multiprocessor systems, and multicore systems since 1997. He
has 14 years of experience designing and developing diverse types of complex parallelized solutions
that take advantage of multicore with C# and .NET Framework. He has been working with Parallel
Extensions since the first Community Technology Preview (CTP). He is heavily involved with consulting,
training, and development on the .NET platform, focusing on creating efficient software for
modern hardware. He regularly speaks on software development at conferences all over the world.
In 2009, he was awarded an Intel® Black Belt Software Developer award.
Gastón has written four books in English, contributed chapters to two other books, and has written
more than 40 books in Spanish. He contributes to Dr. Dobb’s at www.drdobbs.com and Dr. Dobb’s
Go Parallel programming portal at www.ddj.com/go-parallel/, and is a guest blogger at Intel
Software Network (http://software.intel.com).
He has worked as a developer, architect, and project manager for many companies in Buenos
Aires, Argentina. Now, he is an independent IT consultant working for several American, German,
Spanish, and Latin American companies, and a freelance author. He is always looking for new
adventures around the world.
He lives with his wife, Vanesa, and his son, Kevin. When not tinkering with computers, he enjoys
developing and playing with wireless virtual reality devices and electronic toys with his father, his
son, and his nephew, Nico.
You can reach him at gastonhillar@hotmail.com and follow him on Twitter at
http://twitter.com/gastonhillar. Gastón’s blog is at http://csharpmulticore.blogspot.com.
ABOUT THE TECHNICAL EDITOR
DOUG PARSONS is a software architect and the director of Ohio Operations for NJI New Media.
His expertise is in web development with a specialization in political websites. Most notably, he has
worked on the 2008 John McCain presidential campaign website, and more recently, he has worked
on Mitt Romney’s official book tour website. In his down time, he enjoys spending time with his
lovely fiancée, Marisa, and their puppies.
ACKNOWLEDGMENTS
PARALLEL PROGRAMMING IS ONE of the most difficult topics to write about. It is usually difficult to
isolate subjects without having to reference many closely related topics. However, I had a lot of help
during all the necessary stages to produce a high-quality book on a very challenging topic.
Special thanks go to Paul Reese, Edward Connor, Ginny Munroe, and Rosemarie Graham — they
had a lot of patience, and they allowed me to make the necessary changes to the chapters in order
to include the most accurate and appropriate information. The book required a lot of work, and
they understood that writing an advanced book about parallel programming is a bit different from
writing books about other programming topics. They worked very hard to make this book possible.
In addition, I must thank Doug Parsons and Kathryn Duggan. You will notice their improvements
when you read each sentence. They allowed me to convert a draft into a proof with their valuable
feedback.
I wish to acknowledge Stephen Toub, Principal Program Manager of the Parallel Computing
Platform team at Microsoft, who provided me with invaluable feedback for all the chapters. I was
able to improve the examples and the contents by incorporating Stephen’s insightful comments. This
book would have been very difficult to finish without Stephen’s help. His team’s blog is at
http://blogs.msdn.com/b/pfxteam/. The blog is an essential resource for keeping up-to-date with Parallel
Extensions improvements and usage.
I must also thank Daniel Moth, member of the Microsoft Technical Computing group. Daniel
helped me to improve the chapter that covers the new and exciting debugging features included in
Visual Studio 2010. His feedback allowed me to incorporate a fascinating chapter into this book.
Special thanks go to Aaron Tersteeg and Kathy Farrel, managers of the excellent Parallel Programming
community at Intel Software Network. I had the opportunity to enrich my knowledge in parallel
computing topics through this great community. I wouldn’t have been able to write this book without
listening to and watching the Parallel Programming Talk shows
(www.intel.com/software/parallelprogrammingtalk) that kept me up-to-date with the parallel computing trends.
Some of the information in this book is the result of intensive discussions I had at the Intel Black
Belt Annual Meetups — I would like to acknowledge James Reinders, Dr. Clay Breshears, Jim
Dempsey, and Doug Holland for sharing their wisdom. Doug also shares with me a passion for
.NET, and I learned a great deal about his first experiences with Parallel Extensions via his blog: An
Architect’s Perspective (http://blogs.msdn.com/b/dohollan/).
I must also thank Jon Erickson, editor of the Dr. Dobb’s website (www.drdobbs.com). Jon gave me the
opportunity to contribute to both Dr. Dobb’s and Dr. Dobb’s Go Parallel (www.ddj.com/go-parallel/)
in order to share my experience with other developers and architects. This book incorporates the great
feedback received from my contributions.
I wish to acknowledge Hector A. Algarra, who always helped me to improve my writing skills.
Special thanks go to my wife, Vanesa S. Olsen, my son, Kevin, my nephew, Nicolas, my father,
Jose Carlos, my sister, Silvina, and my mother, Susana. They were my greatest supporters during
the production of this book.
And finally, thanks to all of you for selecting this book. I hope the parallel programming knowledge
that you gain from it will help you develop powerful, high-performance applications, and responsive
user interfaces.
CONTENTS
FOREWORD
INTRODUCTION
CHAPTER 1: TASK-BASED PROGRAMMING
Working with Shared-Memory Multicore
Differences Between Shared-Memory Multicore and Distributed-Memory Systems
Parallel Programming and Multicore Programming
Understanding Hardware Threads and Software Threads
Understanding Amdahl’s Law
Considering Gustafson’s Law
Working with Lightweight Concurrency
Creating Successful Task-Based Designs
Designing With Concurrency in Mind
Understanding the Differences between Interleaved Concurrency, Concurrency, and Parallelism
Parallelizing Tasks
Minimizing Critical Sections
Understanding Rules for Parallel Programming for Multicore
Preparing for NUMA and Higher Scalability
Deciding the Convenience of Going Parallel
Summary
CHAPTER 2: IMPERATIVE DATA PARALLELISM
Launching Parallel Tasks
System.Threading.Tasks.Parallel Class
Parallel.Invoke
No Specific Execution Order
Advantages and Trade-Offs
Interleaved Concurrency and Concurrency
Transforming Sequential Code to Parallel Code
Detecting Parallelizable Hotspots
Measuring Speedups Achieved by Parallel Execution
Understanding the Concurrent Execution
Parallelizing Loops
Parallel.For
Refactoring an Existing Sequential Loop
Measuring Scalability
Working with Embarrassingly Parallel Problems
Parallel.ForEach
Working with Partitions in a Parallel Loop
Optimizing the Partitions According to the Number of Cores
Working with IEnumerable Sources of Data
Exiting from Parallel Loops
Understanding ParallelLoopState
Analyzing the Results of a Parallel Loop Execution
Catching Exceptions that Occur Inside Parallel Loops
Specifying the Desired Degree of Parallelism
ParallelOptions
Counting Hardware Threads
Logical Cores Aren’t Physical Cores
Using Gantt Charts to Detect Critical Sections
Summary
CHAPTER 3: IMPERATIVE TASK PARALLELISM
Creating and Managing Tasks
System.Threading.Tasks.Task
Understanding a Task’s Status and Lifecycle
TaskStatus: Initial States
TaskStatus: Final States
Using Tasks to Parallelize Code
Starting Tasks
Visualizing Tasks Using Parallel Tasks and Parallel Stacks
Waiting for Tasks to Finish
Forgetting About Complex Threads
Cancelling Tasks Using Tokens
CancellationTokenSource
CancellationToken
TaskFactory
Handling Exceptions Thrown by Tasks
Returning Values from Tasks
TaskCreationOptions
Chaining Multiple Tasks Using Continuations
Mixing Parallel and Sequential Code with Continuations
Working with Complex Continuations
TaskContinuationOptions
Programming Complex Parallel Algorithms with Critical Sections Using Tasks
Preparing the Code for Concurrency and Parallelism
Summary
CHAPTER 4: CONCURRENT COLLECTIONS
Understanding the Features Offered by Concurrent Collections
System.Collections.Concurrent
ConcurrentQueue
Understanding a Parallel Producer-Consumer Pattern
Working with Multiple Producers and Consumers
Designing Pipelines by Using Concurrent Collections
ConcurrentStack
Transforming Arrays and Unsafe Collections into Concurrent Collections
ConcurrentBag
IProducerConsumerCollection
BlockingCollection
Cancelling Operations on a BlockingCollection
Implementing a Filtering Pipeline with Many BlockingCollection Instances
ConcurrentDictionary
Summary
CHAPTER 5: COORDINATION DATA STRUCTURES
Using Cars and Lanes to Understand the Concurrency Nightmares
Undesired Side Effects
Race Conditions
Deadlocks
A Lock-Free Algorithm with Atomic Operations
A Lock-Free Algorithm with Local Storage
Understanding New Synchronization Mechanisms
Working with Synchronization Primitives
Synchronizing Concurrent Tasks with Barriers
Barrier and ContinueWhenAll
Catching Exceptions in all Participating Tasks
Working with Timeouts
Working with a Dynamic Number of Participants
Working with Mutual-Exclusion Locks
Working with Monitor
Working with Timeouts for Locks
Refactoring Code to Avoid Locks
Using Spin Locks as Mutual-Exclusion Lock Primitives
Working with Timeouts
Working with Spin-Based Waiting
Spinning and Yielding
Using the Volatile Modifier
Working with Lightweight Manual Reset Events
Working with ManualResetEventSlim to Spin and Wait
Working with Timeouts and Cancellations
Working with ManualResetEvent
Limiting Concurrency to Access a Resource
Working with SemaphoreSlim
Working with Timeouts and Cancellations
Working with Semaphore
Simplifying Dynamic Fork and Join Scenarios with CountdownEvent
Working with Atomic Operations
Summary
CHAPTER 6: PLINQ: DECLARATIVE DATA PARALLELISM
Transforming LINQ into PLINQ
ParallelEnumerable and Its AsParallel Method
AsOrdered and the orderby Clause
Specifying the Execution Mode
Understanding Partitioning in PLINQ
Performing Reduction Operations with PLINQ
Creating Custom PLINQ Aggregate Functions
Concurrent PLINQ Tasks
Cancelling PLINQ
Specifying the Desired Degree of Parallelism
WithDegreeOfParallelism
Measuring Scalability
Working with ForAll
Differences Between foreach and ForAll
Measuring Scalability
Configuring How Results Are Returned by Using WithMergeOptions
Handling Exceptions Thrown by PLINQ
Using PLINQ to Execute MapReduce Algorithms
Designing Serial Stages Using PLINQ
Locating Processing Bottlenecks
Summary
CHAPTER 7: VISUAL STUDIO 2010 TASK DEBUGGING CAPABILITIES
Taking Advantage of Multi-Monitor Support
Understanding the Parallel Tasks Debugger Window
Viewing the Parallel Stacks Diagram
Following the Concurrent Code
Debugging Anonymous Methods
Viewing Methods
Viewing Threads in the Source Code
Detecting Deadlocks
Summary
CHAPTER 8: THREAD POOLS
Going Downstairs from the Tasks Floor
Understanding the New CLR 4 Thread Pool Engine
Understanding Global Queues
Waiting for Worker Threads to Finish Their Work
Tracking a Dynamic Number of Worker Threads
Using Tasks Instead of Threads to Queue Jobs
Understanding the Relationship Between Tasks and the Thread Pool
Understanding Local Queues and the Work-Stealing Algorithm
Specifying a Custom Task Scheduler
Summary
CHAPTER 9: ASYNCHRONOUS PROGRAMMING MODEL
Mixing Asynchronous Programming with Tasks
Working with TaskFactory.FromAsync
Programming Continuations After Asynchronous Methods End
Combining Results from Multiple Concurrent Asynchronous Operations
Performing Asynchronous WPF UI Updates
Performing Asynchronous Windows Forms UI Updates
Creating Tasks that Perform EAP Operations
Working with TaskCompletionSource
Summary
CHAPTER 10: PARALLEL TESTING AND TUNING
Preparing Parallel Tests
Working with Performance Profiling Features
Measuring Concurrency
Solutions to Common Patterns
Serialized Execution
Lock Contention
Lock Convoys
Oversubscription
Undersubscription
Partitioning Problems
Workstation Garbage-Collection Overhead
Working with the Server Garbage Collector
I/O Bottlenecks
Main Thread Overload
Understanding False Sharing
Summary
CHAPTER 11: VECTORIZATION, SIMD INSTRUCTIONS, AND ADDITIONAL PARALLEL LIBRARIES
Understanding SIMD and Vectorization
From MMX to SSE4.x and AVX
Using the Intel Math Kernel Library
Working with Multicore-Ready, Highly Optimized Software Functions
Mixing Task-Based Programming with External Optimized Libraries
Generating Pseudo-Random Numbers in Parallel
Using Intel Integrated Performance Primitives
Summary
APPENDIX A: .NET 4 PARALLELISM CLASS DIAGRAMS
Task Parallel Library
System.Threading.Tasks.Parallel Classes and Structures
Task Classes, Enumerations, and Exceptions
Data Structures for Coordination in Parallel Programming
Concurrent Collection Classes: System.Collections.Concurrent
Lightweight Synchronization Primitives
Lazy Initialization Classes
PLINQ
Threading
Thread and ThreadPool Classes and Their Exceptions
Signaling Classes
Threading Structures, Delegates, and Enumerations
BackgroundWorker Component
APPENDIX B: CONCURRENT UML MODELS
Structure Diagrams
Class Diagram
Component Diagram
Deployment Diagram
Package Diagram
Behavior Diagrams
Activity Diagram
Use Case Diagram
Interaction Diagrams
Interaction Overview Diagram
Sequence Diagram
APPENDIX C: PARALLEL EXTENSIONS EXTRAS
Inspecting Parallel Extensions Extras
Coordination Data Structures
Extensions
Parallel Algorithms
Partitioners
Task Schedulers
INDEX
FOREWORD
aParllel prgoamrmnig s i ahdr. Hmm, let me try that again. Parallel programming is hard.
While truly a silly example, my first sentence exemplifies some of the difficulties we, as developers,
face while writing multithreaded code. As I write this foreword, my hands typing on my laptop
are effectively two distinct, physical processes, and if you further consider each of my fingers as an
individual entity, the count would instead be 10. I’m generally acknowledged to be a fast typist, and
in order to type at over 100 words per minute, my brain manages to coordinate all of my fingers,
flowing them concurrently toward their next targets, yet still (most of the time) ensuring that their
output is correctly serialized according to the spelling of the words my mind is willing my hands
to render. I deliberately suspended that coordination to deliver that first sentence, such that my
hands were no longer synchronizing correctly. The result is a barely readable representation of my
thoughts. Luckily, it was easily debugged.
Parallel programming is indeed hard, or at least it has been historically. With the tools that have
been available, only a small percentage of software developers have been proficient enough in the art
to successfully develop and debug multithreaded applications. And yet, since the advent of modern
computers, developers who need to write responsive user interfaces, build scalable services, or take
advantage of multiple processing cores for performance have been forced to deal with concurrency,
forced to develop software at the level of threads, mutexes, and semaphores. The difficulties here
abound: oversubscription, race conditions, deadlocks, live locks, two-step dances, priority inversions,
lock convoys, contention, and false sharing, just to name a few.
With all of these complications and with the recent industry shift toward multicore and manycore,
parallel programming has received a lot of attention from companies that build development platforms,
with Microsoft chief among them. Several years ago, the Parallel Computing Platform team
at Microsoft emerged with a clear vision and purpose: to make building parallelized software easier.
Developers should be able to easily express the parallelism that exists in their applications and allow
the underlying framework, run-time, and operating system to implement that parallelism for them,
mapping the expressed parallelism down to the hardware for correct and efficient execution. The
first wave of supported components from the Parallel Computing Platform team was released in
April 2010 as part of Visual Studio 2010; whether you’re using native or managed code, this release
provides foundational work to simplify the development of parallel applications. For developers
using managed code, this includes the Task Parallel Library, Parallel LINQ, the new Parallel Stacks
and Parallel Tasks debugging windows, a Concurrency Visualizer that yields deep insights into the
execution of your multithreaded applications, and more.
Even with all of this new support, parallel programming still requires in-depth knowledge. In an age
in which communication abounds in the form of 140-character quips, I personally find there’s no better
medium for conveying that in-depth knowledge than in a quality book. Luckily, you’re reading
one right now. Here, Gastón Hillar delivers a comprehensive book that covers all aspects of developing
parallel applications with Visual Studio 2010 and the .NET Framework 4. From task-based
programming to data parallelism to managing shared state to debugging parallel programs, and
from the Task Parallel Library to Parallel LINQ to the ThreadPool to new coordination and synchronization
primitives, Gastón provides a welcome and in-depth tour through the vast support for
parallel programming that now exists in .NET Framework 4 and Visual Studio 2010.
This book contains information that can provide you with solid foundational knowledge you’ll want
when developing parallel applications. Congratulations on taking your first steps into this brave new
manycore world.
—Stephen Toub
Principal Program Manager
Parallel Computing Platform
Microsoft Corporation
Redmond, WA
September 2010
INTRODUCTION
In 2007, Microsoft released the first Community Technology Preview (CTP) of Parallel Extensions
for the .NET Framework. The old .NET Framework multithreading programming model was too
complex and heavyweight for the forthcoming multicore and manycore CPUs. I had been researching
parallel programming, multiprocessor systems, and multicore systems since 1997, so I couldn’t help installing
the first CTP and trying it. It was obvious that it was going to be an exciting new way of expressing
parallelism in future C# versions.
Visual Studio 2010 ships with version 4 of the .NET Framework, the first release to include Parallel
Extensions. C# 4 and .NET Framework 4 allow you to shift to a modern task-based programming
model to express parallelism. It is easier to write code that takes advantage of multicore microprocessors.
Now, you can write code that scales as the number of available cores increases, without
having to work with complex managed threads. You are able to write code that runs tasks, and the
Common Language Runtime (CLR) will inject the necessary threads for you. It is easy to run data
parallelism algorithms taking advantage of multicore.
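As a rough illustration of the task-based model just described, the following minimal sketch runs a data-parallel loop inside a task and lets the runtime decide how many threads to use. The class name TaskBasedSketch and the helper SumOfSquares are hypothetical names chosen for this example; only Parallel.For and Task.Factory.StartNew come from .NET Framework 4.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskBasedSketch
{
    // Hypothetical helper: sums the squares of 0..n-1 with a parallel loop.
    public static long SumOfSquares(int n)
    {
        long total = 0;
        object gate = new object();

        Parallel.For<long>(0, n,
            () => 0L,                                     // each worker starts with its own subtotal
            (i, state, local) => local + (long)i * i,     // loop body touches only the local subtotal
            local => { lock (gate) { total += local; } }  // merge each subtotal once, under a lock
        );

        return total;
    }

    public static void Main()
    {
        // No threads are created by hand; the task is scheduled on the thread pool.
        Task<long> task = Task.Factory.StartNew(() => SumOfSquares(1000));
        Console.WriteLine(task.Result);
    }
}
```

The per-worker subtotal keeps the lock out of the loop body, so the only synchronization happens when each worker finishes. This is one possible design under these assumptions, not necessarily the approach the chapters take.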
At the time of this writing, multicore microprocessors are everywhere. Servers, desktop computers,
laptops and notebooks, netbooks, mobile Internet devices (MIDs), tablets, and even smartphones
use multicore microprocessors. The average number of cores in each microprocessor is going to
increase in the forthcoming years. Are you going to lose the opportunity to transform this multicore
power into application performance?
Parallel programming must become part of your skill set to effectively develop applications for modern
hardware in C#. I spent more than three years working with the diverse versions of Parallel Extensions
until Visual Studio 2010 was officially released. I enjoyed developing parallelized applications with C#,
and I did my best to include explanations for the most common scenarios in this book.
Visual Studio 2010 provides an IDE prepared for a parallel developer, and C# is an excellent fit for
the new task-based programming model.
WHO THIS BOOK IS FOR
This book was written to help experienced C# developers transform the multicore power found in
modern microprocessors into application performance by using the Parallel Extensions introduced
in .NET Framework 4. Whether you are just starting the transition from the previous multithreading
model or have worked with concurrent and parallel code for a while and need to gain
a deeper understanding, this book provides information on the most common parallel programming
skills and concepts you need.
This book offers a wide-ranging presentation of parallel programming concepts, but parallel programming
possibilities with C# and .NET Framework 4 are so large and comprehensive that no
single book can cover them all. The goal of this book is to provide a working knowledge of key
technologies that are important to C# developers who want to take advantage of multicore and
manycore architectures. It provides adequate knowledge for an experienced C# developer to work in
many high-level parallelism areas. The book covers the new task-based programming model. Some
developers who are interested in distributed parallelism and low-level concurrent programming top-
ics may choose to add to their knowledge by supplementing this book with other books dedicated
entirely to these technology areas.
This book provides background information that is very important to avoid common paral-
lel programming pitfalls; therefore, it is best to read it without skipping chapters. Moreover, you
should finish reading a chapter before considering the code shown in the middle of that chapter as
a best practice. As each chapter introduces new features for Parallel Extensions, the examples are
enhanced with simpler and more efficient coding techniques.
This book assumes that you are an experienced C# and .NET Framework 4 developer and
focuses on parallel programming topics. If you don’t have experience with advanced object-
oriented programming, lists, arrays, closures, delegates, lambda expressions, LINQ, typecast-
ing, and basic debugging techniques, you may need additional training to fully understand the
examples shown.
WHAT THIS BOOK COVERS
This book covers the following key technologies and concepts:
• Modern multicore and manycore shared-memory architectures
• High-level, task-based programming with Task Parallel Library (TPL), C#, and .NET
  Framework 4
• Parallel Language Integrated Query (PLINQ)
• Most common coordination data structures and synchronization primitives for task-based
  programming
• Visual Studio 2010 debugging and profiling capabilities related to parallel programming
• Additional libraries, tools, and extras that help you master multicore programming in
  real-life applications
This book does not cover the old multithreaded programming model or distributed parallelism.
HOW THIS BOOK IS STRUCTURED
It is critical to master certain topics first. Unless you have previous experience with the new task-
based programming model introduced in .NET Framework 4, you should read the book chapter
by chapter. Each chapter was written with the assumption that you have read the previous chapter.
However, if you have previously worked with TPL and PLINQ, you will be able to read and under-
stand the content included in the chapters related to parallel debugging and tuning.
The book is divided into the following 11 chapters and three appendixes:
Chapter 1, “Task-Based Programming” — Explore the new task-based programming model
that allows you to introduce parallelism in .NET Framework 4 applications. Parallelism is
essential to exploit modern multicore and manycore architectures. This chapter describes
the new lightweight concurrency models and important concepts related to concurrency
and parallelism. It is important to read this chapter, because it includes the necessary
background information in order to prepare your mind for the next 10 chapters and three
appendixes.
Chapter 2, “Imperative Data Parallelism” — Start learning the new programming models
introduced in C# 4 and .NET Framework 4 and apply them with pure data parallel prob-
lems. This chapter is about some of the new classes, structures, and enumerations that allow
you to deal with data parallelism scenarios. Run the examples to understand the perfor-
mance improvements.
Chapter 3, “Imperative Task Parallelism” — Start working with the new Task instances to
solve imperative task parallelism problems and complex algorithms with simple code. This
chapter is about the new classes, structures, and enumerations that allow you to deal with
imperative task parallelism scenarios. Implement existing algorithms in parallel using basic
and complex features offered by the new task-based programming model. Create parallel
code using tasks instead of threads.
Chapter 4, “Concurrent Collections” — Task-based programming, imperative data, and
task parallelism require arrays, lists, and collections capable of supporting concurrent
updates. Work with the new concurrent collections to simplify the code and to achieve the
best performance. This chapter is about the new classes and the new interface that allow
you to work with shared concurrent collections from multiple tasks. It explains how to
create parallel code that adds, removes, and updates values of different types in lists with
diverse ordering schemes and structures.
Chapter 5, “Coordination Data Structures” — Synchronize the work performed by diverse
concurrent tasks. This chapter covers some classic synchronization primitives and the new
lightweight coordination data structures introduced in .NET Framework 4. It is important
to learn the different alternatives, so that you can choose the most appropriate one for each
concurrency scenario that requires communication and/or synchronization between multi-
ple tasks. This is the most complex chapter in the book and one of the most important ones.
Be sure to read it before writing complex parallelized algorithms.
Chapter 6, “PLINQ: Declarative Data Parallelism” — Work with declarative data parallel-
ism using Parallel Language Integrated Query (PLINQ) and its aggregate functions. You can
use PLINQ to simplify the code that runs a mix of task and data decomposition. You can
also execute the classic parallel Map Reduce algorithm. This chapter combines many of the
topics introduced in previous chapters and explains how to transform a LINQ query into a
PLINQ query. In addition, the chapter teaches different techniques to tune PLINQ’s parallel
execution according to diverse scenarios.
Chapter 7, “Visual Studio 2010 Task Debugging Capabilities” — Take full advantage of
the new task debugging features introduced in Visual Studio 2010. This chapter describes
how the new windows display important information about the tasks and their relationships
with the source code and the threads assigned to support their execution. Use these new
windows to detect and solve potential bugs when working with parallelized code in .NET
Framework 4.
Chapter 8, “Thread Pools” — Understand the differences between using tasks and
directly requesting work items to run in threads in the thread pool. If you work with a
thread pool, you can take advantage of the new improvements and move your code to a
task-based programming model. This chapter is about the changes in the CLR thread pool
engine introduced in .NET Framework 4 and provides an example of a customized task
scheduler.
Chapter 9, “Asynchronous Programming Model” — Leverage the advantages of mixing
the existing asynchronous programming models with tasks. This chapter provides real-
life examples that take advantage of the simplicity of tasks and continuations to perform
concurrent asynchronous jobs related to the existing asynchronous programming models.
In addition, the chapter teaches one of the most complex topics related to concurrent pro-
gramming: the process of updating the User Interface (UI) from diverse tasks and threads.
The chapter explains patterns to update the UI in both Windows Forms and Windows
Presentation Foundation (WPF) applications.
Chapter 10, “Parallel Testing and Tuning” — Leverage the new concurrency profiling
features introduced in Visual Studio 2010 Premium and Ultimate editions. It is very impor-
tant to learn the common, easy-to-detect problems related to parallelized code with .NET
Framework 4. This chapter explains the different techniques used to create and run parallel
tests and benchmarks. It also teaches you to refactor an existing application according to
the results of each profiling session.
Chapter 11, “Vectorization, SIMD Instructions, and Additional Parallel Libraries” — Take
advantage of other possibilities offered by modern hardware related to parallelism. .NET
Framework 4 does not offer direct support to SIMD or vectorization. However, most
modern microprocessors provide these powerful additional instructions. Thus, you can
use libraries optimized to take advantage of the performance improvements provided by
these instructions. This chapter explains how to integrate Intel Math Kernel Library into
task-based programming code using C#. In addition, it explains how to optimize critical
sections using Intel Integrated Performance Primitives.
Appendix A, “.NET 4 Parallelism Class Diagrams” — This appendix includes diagrams for
the classes, interfaces, structures, delegates, enumerations, and exceptions that support par-
allelism with the new lightweight concurrency model and the underlying threading model.
There are also references to the chapters that explain the contents of these diagrams in more
detail.
Appendix B, “Concurrent UML Models” — This appendix gives you some examples of how
you can use UML models to represent designs and code prepared for concurrency. You can
extend the classic models by adding a few simple and standardized visual elements.
Appendix C, “Parallel Extensions Extras” — Parallel Extensions Extras is a complementary
project that isn’t part of the .NET Framework 4 classes, but was developed by Microsoft as
part of the parallel programming samples for .NET Framework 4. This appendix includes
diagrams and brief descriptions for the classes and structures that constitute the Parallel
Extensions Extras.
WHAT YOU NEED TO USE THIS BOOK
To get the most out of this book, you’ll need Visual Studio 2010 Premium or Ultimate Edition,
which includes .NET Framework 4 and the concurrency profiling features. You may use Visual
Studio 2010 Standard Edition instead, but the concurrency profiling features aren’t available in this
edition. Nor should you use Visual C# 2010 Express Edition, because it doesn’t provide the neces-
sary debugging windows to work with task-based programming.
In addition, you’ll need at least two physical cores in your developer computer to understand the
examples shown in the book. However, if you want to test scalability, at least three physical cores is
a better option.
Windows 7 and Windows Server 2008 R2 introduced significant enhancements in their schedulers to
improve scalability in multicore and manycore systems. The book is based on applications running
on these Windows versions. If you work with previous Windows versions, the results might differ.
CONVENTIONS
To help you get the most from the text and keep track of what’s happening, we’ve used a number of
conventions throughout the book.
Notes, tips, hints, tricks, and asides to the current discussion are offset and
placed in italics like this.
As for styles in the text:
• We highlight new terms and important words when we introduce them.
• We show keyboard strokes like this: Ctrl+A.
• We show file names, URLs, and code within the text like so: persistence.properties.
• We present code in two different ways:
    We use a monofont type with no highlighting for most code examples.
    We use bold to emphasize code that’s particularly important in the present context.
SOURCE CODE
As you work through the examples in this book, you may choose either to type in all the code
manually or to use the source code files that accompany the book. All of the source code used in
this book is available for download at www.wrox.com. Once at the site, simply locate the book’s title
(either by using the Search box or by using one of the title lists) and click the Download Code link
on the book’s detail page to obtain all the source code for the book.
Because many books have similar titles, you may find it easiest to search by
ISBN. This book’s ISBN is 978-0-470-49599-5.
Once you download the code, just decompress it with your favorite compression tool. Alternatively,
you can go to the main Wrox code download page at www.wrox.com/dynamic/books/download.aspx
to see the code available for this book and all other Wrox books.
ERRATA
We make every effort to ensure that there are no errors in the text or in the code. However, no one
is perfect, and mistakes do occur. If you find an error in one of our books, like a spelling mistake
or faulty piece of code, we would be very grateful for your feedback. By sending in errata, you may
save another reader hours of frustration, and at the same time, you will be helping us provide even
higher-quality information.
To find the errata page for this book, go to www.wrox.com and locate the title using the Search box or
one of the title lists. Then, on the book details page, click the Book Errata link. On this page, you can
view all errata that have been submitted for this book and posted by Wrox editors. A complete book
list, including links to each book’s errata, is also available at
www.wrox.com/misc-pages/book-list.shtml.
If you don’t spot “your” error on the Book Errata page, go to
www.wrox.com/contact/techsupport.shtml and complete the form there to send us the error you
have found. We’ll check the information and, if appropriate, post a message to the book’s errata
page and fix the problem in subsequent editions of the book.
P2P.WROX.COM
For author and peer discussion, join the P2P forums at p2p.wrox.com. The forums are a web-based
system for you to post messages relating to Wrox books and technologies and interact with other
readers and technology users. The forums offer a subscription feature to email you topics of inter-
est of your choosing when new posts are made to the forums. Wrox authors, editors, other industry
experts, and your fellow readers are present on these forums.
At http://p2p.wrox.com, you will find a number of different forums that will help you not only as
you read this book, but also as you develop your own applications. To join the forums, just follow
these steps:
1. Go to p2p.wrox.com and click the Register link.
2. Read the terms of use and click Agree.
3. Complete the required information to join as well as any optional information you wish to
provide and click Submit.
4. You will receive an email with information describing how to verify your account and
complete the joining process.
You can read messages in the forums without joining P2P, but in order to post
your own messages, you must join.
Once you join, you can post new messages and respond to messages other users post. You can read
messages at any time on the web. If you would like to have new messages from a particular forum
emailed to you, click the Subscribe To This Forum icon by the forum name in the forum listing.
For more information about how to use the Wrox P2P, be sure to read the P2P FAQs for answers to
questions about how the forum software works as well as many common questions specific to P2P
and Wrox books. To read the FAQs, click the FAQ link on any P2P page.
1
Task-Based Programming
WHAT’S IN THIS CHAPTER?
• Working with shared-memory multicore
• Understanding the differences between shared-memory multicore
  and distributed-memory systems
• Working with parallel programming and multicore programming in
  shared-memory architectures
• Understanding hardware threads and software threads
• Understanding Amdahl’s Law
• Considering Gustafson’s Law
• Working with lightweight concurrency models
• Creating successful task-based designs
• Understanding the differences between interleaved concurrency,
  concurrency, and parallelism
• Parallelizing tasks and minimizing critical sections
• Understanding rules for parallel programming for multicore
  architectures
• Preparing for NUMA architectures
This chapter introduces the new task-based programming model that allows you to introduce
parallelism in applications. Parallelism is essential to exploit modern shared-memory multicore
architectures. The chapter describes the new lightweight concurrency models and important
concepts related to concurrency and parallelism. It includes the necessary background
information to prepare you for the next 10 chapters.
WORKING WITH SHARED-MEMORY MULTICORE
In 2005, Herb Sutter published an article in Dr. Dobb’s Journal titled “The Free Lunch Is Over: A
Fundamental Turn Toward Concurrency in Software”
(www.gotw.ca/publications/concurrency-ddj.htm). He discussed the need to design software with
concurrency in mind in order to fully exploit the continuing exponential gains in microprocessor
throughput. Microprocessor manufacturers are adding processing cores instead of increasing clock
frequencies. Software developers can no longer rely on the free-lunch performance gains that
increases in clock frequency used to provide.
Most machines today have at least a dual-core microprocessor. However, quad-core and octal-
core microprocessors, with four and eight cores, respectively, are quite popular on servers,
advanced workstations, and even on high-end mobile computers. More cores in a single
microprocessor are right around the corner. Modern microprocessors offer new multicore
architectures. Thus, it is very important to prepare the software designs and the code to exploit
these architectures. The different kinds of applications generated with Visual C# 2010 and
.NET Framework 4 run on one or many central processing units (CPUs), the main microproces-
sors. Each of these microprocessors can have a different number of cores, capable of executing
instructions.
You can think of a multicore microprocessor as many interconnected microprocessors in a single
package. All the cores have access to the main memory, as illustrated in Figure 1-1. Thus, this
architecture is known as shared-memory multicore. Sharing memory in this way can easily lead to
a performance bottleneck.
FIGURE 1-1: Cores #0, #1, #2, through #n, all connected to the main memory (shared memory)
Multicore microprocessors have many different complex micro-architectures, designed to offer
more parallel-execution capabilities, improve overall throughput, and reduce potential bottlenecks.
At the same time, multicore microprocessors try to reduce power consumption and generate less
heat. Therefore, many modern microprocessors can increase or reduce the frequency of each core
according to its workload, and they can even put cores to sleep when they are not in use. Windows 7
and Windows Server 2008 R2 support a new feature called Core Parking. When this feature is active
and many cores aren’t in use, these operating systems put the idle cores to sleep. When these
cores are needed again, the operating systems wake them.
Modern microprocessors work with dynamic frequencies for each of their cores. Because the cores
don’t work with a fixed frequency, it is difficult to predict the performance for a sequence of instruc-
tions. For example, Intel Turbo Boost Technology increases the frequency of the active cores. The
process of increasing the frequency for a core is also known as overclocking.
If a single core is under a heavy workload, this technology will allow it to run at higher frequencies
when the other cores are idle. If many cores are under heavy workloads, they will run at higher
frequencies but not as high as the one achieved by the single core. The microprocessor cannot keep
all the cores overclocked for long, because doing so consumes more power and raises its temperature
faster. The average clock frequency for all the cores under heavy workloads is going to be lower than
the one achieved for the single core. Therefore, under certain situations, some code can run at higher
frequencies than other code, which can make measuring real performance gains a challenge.
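Because clock frequencies change while the code runs, a single timing is rarely representative. The following is a minimal sketch, not a full benchmarking harness, that performs a warm-up run and then keeps the best of several runs measured with Stopwatch. The class name TimingSketch, the helper BestOfRuns, and the workload SumOfSquareRoots are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Hypothetical CPU-bound workload used only for illustration.
    public static double SumOfSquareRoots(int count)
    {
        double total = 0;
        for (int i = 1; i <= count; i++)
        {
            total += Math.Sqrt(i);
        }
        return total;
    }

    // Runs the action once as a warm-up, then 'runs' more times,
    // and returns the shortest elapsed time in milliseconds.
    public static double BestOfRuns(Action action, int runs)
    {
        action(); // Warm-up: keeps JIT compilation out of the measured runs.
        double best = double.MaxValue;
        for (int run = 0; run < runs; run++)
        {
            Stopwatch stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            best = Math.Min(best, stopwatch.Elapsed.TotalMilliseconds);
        }
        return best;
    }

    public static void Main()
    {
        double best = BestOfRuns(() => SumOfSquareRoots(5000000), 5);
        Console.WriteLine("Best of 5 runs: {0:F2} ms", best);
    }
}
```

Taking the best of several runs reduces the noise introduced by frequency changes and by other software threads, but it does not remove it. Comparing sequential and parallel versions under the same conditions remains necessary.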
Differences Between Shared-Memory Multicore and
Distributed-Memory Systems
Distributed-memory computer systems are composed of many microprocessors with their own
private memory, as illustrated in Figure 1-2. Each microprocessor can be in a different computer,
with different types of communication channels between them. Examples of communication
channels are wired and wireless networks. If a job running in one of the microprocessors requires
remote data, it has to communicate with the corresponding remote microprocessor through the
communication channel. One of the most popular communications protocols used to program parallel
applications to run on distributed-memory computer systems is Message Passing Interface (MPI). It is
possible to use MPI to take advantage of shared-memory multicore with C# and .NET Framework.
However, MPI’s main focus is to help develop applications that run on clusters. Thus, it adds
overhead that isn’t necessary in shared-memory multicore, where all the cores can access the
memory without the need to send messages.
Figure 1-3 shows a distributed-memory computer system with three machines. Each machine has a
quad-core microprocessor, and a shared-memory architecture for these cores. This way, the private
memory for each microprocessor acts as a shared memory for its four cores.
A distributed-memory system forces you to think about the distribution of the data, because each
message to retrieve remote data can introduce significant latency. Because you can add new
machines (nodes) to increase the number of microprocessors for the system, distributed-memory
systems can offer great scalability.
FIGURE 1-2: Microprocessors #0, #1, and #2, each with one core (Core #0) and its own private
memory, linked by a communication channel between the microprocessors
Parallel Programming and Multicore Programming
Traditional sequential code, where instructions run one after the other, doesn’t take advantage of
multiple cores because the serial instructions run on only one of the available cores. Sequential code
written with Visual C# 2010 won’t take advantage of multiple cores if it doesn’t use the new features
offered by .NET Framework 4 to split the work across many cores. Existing sequential code isn’t
parallelized automatically.
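As a minimal sketch of this difference, the following code runs the same work with a classic for loop and with Parallel.For, one of the features that .NET Framework 4 adds in System.Threading.Tasks. The class name ParallelForSketch and the Work function are hypothetical and stand in for any CPU-bound calculation whose iterations are independent.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelForSketch
{
    // Hypothetical CPU-bound calculation used only for illustration.
    public static double Work(int value)
    {
        return Math.Sqrt(value) * Math.Sin(value);
    }

    // Sequential version: every iteration runs on the calling thread.
    public static double[] RunSequential(int count)
    {
        double[] results = new double[count];
        for (int i = 0; i < count; i++)
        {
            results[i] = Work(i);
        }
        return results;
    }

    // Parallel version: the iterations can be distributed across the available cores.
    // Each iteration writes only to its own array slot, so no synchronization is needed.
    public static double[] RunParallel(int count)
    {
        double[] results = new double[count];
        Parallel.For(0, count, i =>
        {
            results[i] = Work(i);
        });
        return results;
    }

    public static void Main()
    {
        double[] sequential = RunSequential(1000000);
        double[] parallel = RunParallel(1000000);
        Console.WriteLine("Same results: {0}", sequential.SequenceEqual(parallel));
    }
}
```

The parallel version only pays off when each iteration does enough work to outweigh the cost of distributing it, and only when the iterations do not depend on each other.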
Parallel programming is a form of programming in which the code takes advantage of the paral-
lel execution possibilities offered by the underlying hardware. Parallel programming runs many
instructions at the same time. As previously explained, there are many different kinds of parallel
architectures, and their detailed analysis would require a complete book dedicated to the topic.
Multicore programming is a form of programming in which the code takes advantage of the
multiple execution cores to run many instructions in parallel. Multicore and multiprocessor
computers offer more than one processing core in a single machine. Hence, the goal is to do more in
less time by distributing the work across the available cores.
Modern microprocessors can execute the same instruction on multiple data, something classified by
Michael J. Flynn in his proposed Flynn’s taxonomy in 1966 as Single Instruction, Multiple Data
(SIMD). This way, you can take advantage of these vector processors to reduce the time needed to
execute certain algorithms.
FIGURE 1-3: Microprocessors #0, #1, and #2, each with cores #0 through #n and a private memory
that acts as shared memory for its four cores, linked by a communication channel between the
microprocessors
This book covers two areas of parallel programming in great detail: shared-memory multicore
programming and the usage of vector-processing capabilities. The overall goal is to reduce the
execution time of the algorithms. The additional processing power enables you to add new
features to existing software, as well.
UNDERSTANDING HARDWARE THREADS AND
SOFTWARE THREADS
A multicore microprocessor has more than one physical core — real independent processing units
that make it possible to run instructions at the same time, in parallel. In order to take advantage of
multiple physical cores, it is necessary to run many processes or to run more than one thread in a
single process, creating multithreaded code.
However, each physical core can offer more than one hardware thread, also known as a logi-
cal core or logical processor. Microprocessors with Intel Hyper-Threading Technology (HT or
HTT) offer many architectural states per physical core. For example, many microprocessors
with four physical cores with HT duplicate the architectural states per physical core and offer
eight hardware threads. This technique is known as simultaneous multithreading (SMT) and
it uses the additional architectural states to optimize and increase the parallel execution at the
microprocessor’s instruction level. SMT isn’t restricted to just two hardware threads per physical
core; for example, you could have four hardware threads per core. This doesn’t mean that each
hardware thread represents a physical core. SMT can offer performance improvements for mul-
tithreaded code under certain scenarios. Subsequent chapters provide several examples of these
performance improvements.
Each running program in Windows is a process. Each process creates and runs one or more threads,
known as software threads to differentiate them from the previously explained hardware threads.
A process has at least one thread, the main thread. An operating system scheduler shares out the
available processing resources fairly between all the processes and threads it has to run. Windows
scheduler assigns processing time to each software thread. When Windows scheduler runs on a mul-
ticore microprocessor, it has to assign time from a hardware thread, supported by a physical core, to
each software thread that needs to run instructions. As an analogy, you can think of each hardware
thread as a swim lane and a software thread as a swimmer.
Each software thread shares the private memory space of its parent
process. However, it has its own stack, registers, and private local storage.
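The following minimal sketch illustrates the note above with classic Thread instances. A static field lives in the memory space shared by all the software threads of the process, while a local variable lives on the stack of the thread that declares it. The class name SharedMemorySketch and its members are hypothetical names chosen for this example.

```csharp
using System;
using System.Threading;

public static class SharedMemorySketch
{
    // Lives in the process memory space: visible to every software thread.
    private static int sharedCounter;

    public static int Run(int threadCount, int incrementsPerThread)
    {
        sharedCounter = 0;
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(() =>
            {
                // Declared inside the thread's method: each thread has its own copy on its own stack.
                int localCount = 0;
                for (int i = 0; i < incrementsPerThread; i++)
                {
                    localCount++;
                }
                // The shared field needs an atomic update because many threads write to it.
                Interlocked.Add(ref sharedCounter, localCount);
            });
            threads[t].Start();
        }
        foreach (Thread thread in threads)
        {
            thread.Join();
        }
        return sharedCounter;
    }

    public static void Main()
    {
        Console.WriteLine("Shared counter: {0}", Run(4, 1000));
    }
}
```

The local variable needs no protection because no other thread can reach it. The shared field does, which is why the sketch publishes each thread's total with Interlocked.Add.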
Windows recognizes each hardware thread as a schedulable logical processor. Each logical proces-
sor can run code for a software thread. A process that runs code in multiple software threads can
take advantage of hardware threads and physical cores to run instructions in parallel. Figure 1-4
shows software threads running on hardware threads and on physical cores. Windows scheduler
can decide to reassign one software thread to another hardware thread to load-balance the work
done by each hardware thread. Because there are usually many other software threads waiting for
processing time, load balancing will make it possible for these other threads to run their
instructions by organizing the available resources. Figure 1-5 shows Windows Task Manager
displaying eight hardware threads (logical cores) and their workloads.
Load balancing refers to the practice of distributing work from software
threads among hardware threads so that the workload is fairly shared across
all the hardware threads. However, achieving perfect load balance depends on
the parallelism within the application, the workload, the number of software
threads, the available hardware threads, and the load-balancing policy.
Windows Task Manager and Windows Resource Monitor show the CPU usage
history graphs for hardware threads. For example, if you have a microprocessor
with four physical cores and eight hardware threads, these tools will display
eight independent graphs.
FIGURE 1-4: A process with six software threads (the main software thread, Software Thread #0,
and Software Threads #1 through #5) running on eight hardware threads (#0 through #7) supported
by four physical cores (#0 through #3) that share the main memory
FIGURE 1-5
Windows runs hundreds of software threads by assigning chunks of processing time to each avail-
able hardware thread. You can use Windows Resource Monitor to view the number of software
threads for a specific process in the Overview tab. The CPU panel displays the image name for each
process and the number of associated software threads in the Threads column, as shown in
Figure 1-6 where the vlc.exe process has 32 software threads.
FIGURE 1-6
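The same number that Resource Monitor shows in the Threads column can be read from code. The following minimal sketch uses the Process class in System.Diagnostics to count the software threads of the current process. The class name ThreadCountSketch and the helper CurrentProcessThreadCount are hypothetical.

```csharp
using System;
using System.Diagnostics;

public static class ThreadCountSketch
{
    // Returns the number of software threads that belong to the current process.
    public static int CurrentProcessThreadCount()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return current.Threads.Count;
        }
    }

    public static void Main()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            Console.WriteLine("{0} has {1} software threads",
                current.ProcessName, current.Threads.Count);
        }
    }
}
```

Even a simple console application usually reports more than one software thread, because the runtime creates some threads of its own in addition to the main thread.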
Core Parking is a Windows kernel power manager and kernel scheduler technology designed to
improve the energy efficiency of multicore systems. It constantly tracks the workload of
every hardware thread relative to all the others and can decide to put some of them into sleep mode.
Core Parking dynamically scales the number of hardware threads that are in use based on workload.
When the workload for one of the hardware threads is lower than a certain threshold value, the
Core Parking algorithm will try to reduce the number of hardware threads that are in use by park-
ing some of the hardware threads in the system. In order to make this algorithm efficient, the kernel
scheduler gives preference to unparked hardware threads when it schedules software threads. The
kernel scheduler will try to let the parked hardware threads become idle, and this will allow them to
transition into a lower-power idle state.
Core Parking tries to intelligently schedule work between threads that are running on multiple
hardware threads in the same physical core on systems with microprocessors that include HT. This
scheduling decision decreases power consumption.
Windows Server 2008 R2 supports the complete Core Parking technology. However, Windows 7
also uses the Core Parking algorithm and infrastructure to balance processor performance between
hardware threads with microprocessors that include HT. Figure 1-7 shows Windows Resource
Monitor displaying the activity of eight hardware threads, with four of them parked.
FIGURE 1-7
Regardless of the number of parked hardware threads, the number of hardware threads returned by
.NET Framework 4 functions will be the total number, not just the unparked ones. Core Parking tech-
nology doesn’t limit the number of hardware threads available to run software threads in a process.
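The text above does not name a specific function. A common way to read this count is the Environment.ProcessorCount property, and the following minimal sketch assumes that this is the kind of call being described. The class name LogicalProcessorSketch and the helper HardwareThreads are hypothetical.

```csharp
using System;

public static class LogicalProcessorSketch
{
    // Returns the number of logical processors (hardware threads) reported to the process.
    public static int HardwareThreads()
    {
        return Environment.ProcessorCount;
    }

    public static void Main()
    {
        Console.WriteLine("Hardware threads reported: {0}", HardwareThreads());
    }
}
```

On a microprocessor with four physical cores and two hardware threads per core, this value would be expected to be 8 whether or not some of those hardware threads are parked at the moment of the call.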
A system with eight hardware threads can turn itself into a system with two hardware threads
when it is under a light workload, and then spin up reserve hardware threads as needed.
In some cases, Core Parking can introduce an additional latency to schedule
many software threads that try to run code in parallel. Therefore, it is very important to consider
the resultant latency when measuring the parallel performance.
UNDERSTANDING AMDAHL’S LAW
If you want to take advantage of multiple cores to run more instructions in less time, it is necessary
to split the code into parallel sequences. However, most algorithms need to run some sequential code
to coordinate the parallel execution. For example, it is necessary to start many pieces in parallel and
then collect their results. The code that splits the work and collects the results could be sequential
code that doesn’t take advantage of parallelism. If you concatenate many algorithms like this,
the overall percentage of sequential code could increase and the performance benefits achieved may
decrease.
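The following minimal sketch shows this structure with tasks. The code that splits the range and the code that adds up the partial results are sequential, and only the pieces in the middle can run in parallel. The class name ForkJoinSketch and its methods are hypothetical names used for illustration, and the way the range is split is just one simple choice.

```csharp
using System;
using System.Threading.Tasks;

public static class ForkJoinSketch
{
    // Adds the integers in [from, to). This is the piece of work that runs in parallel.
    public static long SumRange(int from, int to)
    {
        long sum = 0;
        for (int i = from; i < to; i++)
        {
            sum += i;
        }
        return sum;
    }

    public static long ParallelSum(int count, int pieces)
    {
        // Sequential part 1: split the work and start one task per piece.
        Task<long>[] tasks = new Task<long>[pieces];
        int chunk = count / pieces;
        for (int p = 0; p < pieces; p++)
        {
            int from = p * chunk;
            int to = (p == pieces - 1) ? count : from + chunk; // The last piece takes the remainder.
            tasks[p] = Task.Factory.StartNew(() => SumRange(from, to));
        }

        // The pieces can run in parallel until all of them finish.
        Task.WaitAll(tasks);

        // Sequential part 2: collect the partial results.
        long total = 0;
        foreach (Task<long> task in tasks)
        {
            total += task.Result;
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine("Sum of 0..999: {0}", ParallelSum(1000, 4));
    }
}
```

The two sequential parts are small here, but they do not shrink when more cores are added. That fixed cost is what limits the speedup of the algorithm as a whole.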
Gene Amdahl, a renowned computer architect, made observations regarding the maximum
performance improvement that can be expected from a computer system when only a fraction of the
system is improved. He used these observations to define Amdahl’s Law, which consists of the
following formula that tries to predict the theoretical maximum performance improvement (known
as speedup) using multiple processors. It can also be applied with parallelized algorithms that are
going to run with multicore microprocessors.
Maximum speedup (in times) = 1 / ((1 - P) + (P/N))
where:
• P is the portion of the code that runs completely in parallel.
• N is the number of available execution units (processors or physical cores).
According to this formula, if you have an algorithm in which only 50 percent (P = 0.50) of its
total work is executed in parallel, the maximum speedup will be 1.33x on a microprocessor with
two physical cores. Figure 1-8 illustrates an algorithm with 1,000 units of work split into 500
units of sequential work and 500 units of parallelized work. If the sequential version takes 1,000
seconds to complete, the new version with some parallelized code will take no less than
750 seconds.
Maximum speedup (in times) = 1 / ((1 - 0.50) + (0.50 / 2)) = 1.33x
The maximum speedup for the same algorithm on a microprocessor with eight physical cores will be
a really modest 1.78x. Therefore, the additional physical cores will make the code take no less than
562.5 seconds.
Maximum speedup (in times) = 1 / ((1 - 0.50) + (0.50 / 8)) = 1.78x
Figure 1-9 shows the maximum speedup for the algorithm according to the number of physical
cores, from 1 to 16. As you can see, the speedup isn’t linear, and it wastes processing power as the
number of cores increases. Figure 1-10 shows the same information using a new version of the algo-
rithm in which 90 percent (P = 0.90) of its total work is executed in parallel. In fact, 90 percent of
parallelism is a great achievement, but it results in a 6.40x speedup on a microprocessor with
16 physical cores.
Maximum speedup (in times) = 1 / ((1 - 0.90) + (0.90 / 16)) = 6.40x
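The formula can be evaluated with a few lines of C#. This is a minimal sketch, not code from the book: the class `AmdahlSketch` and the method `MaxSpeedup` are hypothetical names, and the 1,000-second sequential running time is the one assumed in the example above.

```csharp
using System;

public static class AmdahlSketch
{
    // Maximum speedup = 1 / ((1 - P) + (P / N))
    // parallelPortion is P (0.0 to 1.0); executionUnits is N (1 or more).
    public static double MaxSpeedup(double parallelPortion, int executionUnits)
    {
        if (parallelPortion < 0.0 || parallelPortion > 1.0)
            throw new ArgumentOutOfRangeException("parallelPortion");
        if (executionUnits < 1)
            throw new ArgumentOutOfRangeException("executionUnits");

        return 1.0 / ((1.0 - parallelPortion) + (parallelPortion / executionUnits));
    }

    public static void Main()
    {
        Console.WriteLine("P = 0.50, N = 2:  {0:F2}x", MaxSpeedup(0.50, 2));
        Console.WriteLine("P = 0.50, N = 8:  {0:F2}x", MaxSpeedup(0.50, 8));
        Console.WriteLine("P = 0.90, N = 16: {0:F2}x", MaxSpeedup(0.90, 16));

        // Lower bound on the running time of a job that takes 1,000 seconds sequentially.
        Console.WriteLine("Minimum time, P = 0.50, N = 8: {0:F1} seconds",
            1000.0 / MaxSpeedup(0.50, 8));
    }
}
```

The printed values should reproduce the worked examples above, up to rounding. Calling `MaxSpeedup` in a loop over N is a simple way to regenerate the curves plotted in the figures.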
FIGURE 1-8 (diagram: the original sequential version runs 1,000 units of sequential work. The optimized version runs 250 units of sequential work, then 500 units of completely parallelized work, then 250 units of sequential work, for a total of 1,000 units. The parallelized work is split into 250 units on each physical core with 2 physical cores, or 62 or 63 units on each physical core with 8 physical cores.)
FIGURE 1-9 (chart: maximum speedup, 0 to 16, plotted against the number of physical cores, 1 to 16)
FIGURE 1-10 (chart: maximum speedup, 0 to 16, plotted against the number of physical cores, 1 to 16)
Amdahl’s Law takes into account changes in the number of physical cores, but
it doesn’t consider potential new features that you could add to existing applica-
tions to take advantage of the additional parallel processing power. For example,
you can create new algorithms that take advantage of the additional cores
while you run other algorithms in parallel that don’t achieve great performance
improvements when they run with more than three cores. You can create designs
that consider different parallelism scenarios to reduce the impact of Amdahl’s
Law. The applications have to evolve as the hardware offers new capabilities.
CONSIDERING GUSTAFSON’S LAW
John Gustafson noticed that Amdahl’s Law viewed the algorithms as fixed, while considering the
changes in the hardware that runs them. Thus, he suggested a reevaluation of this law in 1988. He
considers that speedup should be measured by scaling the problem to the number of processors and
not by fixing the problem size. When the parallel-processing possibilities offered by the hardware
increase, the problem workload scales.
Gustafson’s Law provides the following formula with the focus on the problem size to measure the
amount of work that can be performed in a fixed time:
Total work (in units) = S + (N × P)
where:
• S represents the units of work that run with a sequential execution.
• P is the size of each unit of work that runs completely in parallel.
• N is the number of available execution units (processors or physical cores).
You can consider a problem composed of 50 units of work with a sequential execution. The problem
can also schedule parallel work in 50 units of work for each available core. If you have a micropro-
cessor with two physical cores, the maximum amount of work is going to be 150 units.
Total work (in units) = 50 + (2 × 50) = 150 units of work
Figure 1-11 illustrates an algorithm with 50 units of work with a sequential execution and a parallelized
section. The latter scales according to the number of physical cores. This way, the parallelized
section can process 50 scalable, parallelizable units of work on each core. The workload in the parallelized section
increases when more cores are available. The algorithm can process more data in less time if there
are enough additional units of work to process in the parallelized section. The same algorithm can
run on a microprocessor with eight physical cores. In this case, it will be capable of processing
450 units of work in the same amount of time required for the previous case:
Total work (in units) = 50 + (8 × 50) = 450 units of work
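The two calculations above can be reproduced with a short C# sketch. The names `GustafsonSketch` and `TotalWork` are hypothetical; the code simply evaluates S + (N × P) with the 50-unit figures used in this example.

```csharp
using System;

public static class GustafsonSketch
{
    // Total work = S + (N x P)
    // sequentialUnits is S, parallelUnitsPerCore is P, and cores is N.
    public static long TotalWork(long sequentialUnits, long parallelUnitsPerCore, int cores)
    {
        if (cores < 1)
            throw new ArgumentOutOfRangeException("cores");

        return sequentialUnits + ((long)cores * parallelUnitsPerCore);
    }

    public static void Main()
    {
        foreach (int cores in new[] { 1, 2, 8, 16 })
        {
            Console.WriteLine("{0,2} cores: {1} units of work", cores, TotalWork(50, 50, cores));
        }
    }
}
```

Unlike the Amdahl calculation, the result grows linearly with the number of cores, because the parallel workload is assumed to scale with the hardware.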
FIGURE 1-11 (diagram: with 2 physical cores, the total work is 150 units: 25 units of sequential work, 100 units of completely parallelized work at 50 units on each physical core, and 25 units of sequential work. With 8 physical cores, the total work is 450 units: 25 units of sequential work, 400 units of completely parallelized work at 50 units on each physical core, and 25 units of sequential work.)
Figure 1-12 shows the speedup for the algorithm according to the number of physical cores, from
1 to 16. This speedup is possible provided there are enough units of work to process in parallel
when the number of cores increases. As you can see, the speedup is better than the results offered
by applying Amdahl’s Law. Figure 1-13 shows the total amount of work according to the number of
available physical cores, from 1 to 32.
Sometimes, the amount of time spent in sequential sections of the program depends
on the problem size. In these cases, you can scale the problem size in order to
improve the chances of achieving better speedups than the ones calculated by
Amdahl’s Law. However, some problems have limits in the volume of data to be
processed in parallel that can scale. When this happens, you can add new features
to take advantage of the parallel processing power available in modern hardware, or
you can work with different designs. Subsequent chapters teach many techniques to
prepare algorithms to improve the total work calculated by Gustafson’s Law.
FIGURE 1-12 (chart: speedup, 0 to 16, plotted against the number of physical cores, 1 to 16)
FIGURE 1-13 (chart: total work in units, 0 to 1,800, plotted against the number of physical cores, 1 to 32)
Figure 1-14 illustrates many algorithms composed of several units of work with a sequential execu-
tion and parallelized sections. The parallelized sections scale as the number of available cores
increases. The impact of the sequential sections decreases as more scalable parallelized sections run
units of work. In this case, it is necessary to calculate the total units of work for both the sequential
and parallelized sections and then apply them to the formula to find out the total work with eight
physical cores:
Total sequential work (in units) = 25 + 150 + 100 + 150 = 425 units of work
Total parallel units of work (in units) = 50 + 200 + 300 = 550 units of work
Total work (in units) = 425 + (8 × 550) = 4,825 units of work
FIGURE 1-14 (diagram: sequential work of 25 units; completely parallelized work in blocks of 50 units; sequential work of 150 units; completely parallelized work in blocks of 200 units; sequential work of 100 units; completely parallelized work in blocks of 300 units; sequential work of 150 units)
A sequential execution would be capable of executing only 975 units of work in the same amount
of time:
Total work with a sequential execution (in units) =
25 + 50 + 150 + 200 + 100 + 300 + 150 = 975 units of work
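The same arithmetic can be sketched in C# for an algorithm with several sequential and parallelized sections. The names below are hypothetical, and the section sizes are the ones listed in the calculations above.

```csharp
using System;
using System.Linq;

public static class MultiSectionWorkSketch
{
    // Total work = (sum of sequential sections) + N x (sum of parallelized sections)
    public static int TotalWork(int[] sequentialSections, int[] parallelSections, int cores)
    {
        if (cores < 1)
            throw new ArgumentOutOfRangeException("cores");

        return sequentialSections.Sum() + (cores * parallelSections.Sum());
    }

    public static void Main()
    {
        int[] sequential = { 25, 150, 100, 150 }; // units of sequential work
        int[] parallel = { 50, 200, 300 };        // units of parallelized work per core

        Console.WriteLine("8 cores: {0} units of work", TotalWork(sequential, parallel, 8));
        Console.WriteLine("1 core:  {0} units of work", TotalWork(sequential, parallel, 1));
    }
}
```

With a single core, the formula reduces to the plain sum of all the sections, which is the sequential figure given above.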
WORKING WITH LIGHTWEIGHT CONCURRENCY
Neither Amdahl’s Law nor Gustafson’s Law takes into account the overhead introduced by paral-
lelism. Nor do they consider the existence of patterns that allow the transformation of sequential
parts into new algorithms that can take advantage of parallelism. It is very important to reduce the
sequential code that has to run in applications to improve the usage of the parallel execution units.
In previous .NET Framework versions, if you wanted to run code in parallel in a C# application
(a process) you had to create and manage multiple threads (software threads). Therefore, you had to
write complex multithreaded code. Splitting algorithms into multiple threads, coordinating the dif-
ferent units of code, sharing information between them, and collecting the results are indeed complex
programming jobs. As the number of logical cores increases, it becomes even more complex, because
you need more threads to achieve better scalability.
The multithreading model wasn’t designed to help developers tackle the multicore revolution. In fact,
creating a new thread requires a lot of processor instructions and can introduce a lot of overhead for
each algorithm that has to be split into parallelized threads. Many of the most useful structures and
classes were not designed to be accessed by different threads, and, therefore, a lot of code had to be
added to make this possible. This additional code distracts the developer from the main goal: achiev-
ing a performance improvement through parallel execution.
Because this multithreading model is too complex to handle the multicore revolution, it is known as
heavyweight concurrency. It adds an important overhead. It requires adding too many lines of code
to handle potential problems because of its lack of support of multithreaded access at the framework
level, and it makes the code complex to understand.
The aforementioned problems associated with the multithreading model offered by previous .NET
Framework versions and the increasing number of logical cores offered in modern microprocessors
motivated the creation of a new model for writing parallelized sections of code. The new model is
known as lightweight concurrency, because it reduces the overall overhead needed to create and execute
code in different logical cores. It doesn’t mean that it eliminates the overhead introduced by parallelism,
but the model is prepared to work with modern multicore microprocessors. The heavyweight concur-
rency model was born in the multiprocessor era, when a computer could have many physical micropro-
cessors with one physical core in each. The lightweight concurrency model takes into account the new
microarchitectures in which many logical cores are supported by some physical cores.
The lightweight concurrency model is not just about scheduling work in different logical cores. It
also adds support of multithreaded access at the framework level, and it makes the code much
simpler to understand.
Most modern programming languages are moving to the lightweight concurrency model. Luckily,
.NET Framework 4 is part of this transition. Thus, all the managed languages that can generate .NET
applications can take advantage of the new model.
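The contrast between the two models can be sketched in C#. This is an illustrative example under stated assumptions, not code from the book: `SumRange` is a hypothetical workload, and the split into two halves is arbitrary. `Thread` and `Task.Factory.StartNew` are the .NET Framework 4 types the text refers to.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrencyModelsSketch
{
    // Hypothetical unit of work: sums the integers in [from, to).
    public static long SumRange(int from, int to)
    {
        long sum = 0;
        for (int i = from; i < to; i++)
        {
            sum += i;
        }
        return sum;
    }

    // Heavyweight style: create, start, and join the threads by hand.
    public static long SumWithThreads(int n)
    {
        long low = 0, high = 0;
        var first = new Thread(() => low = SumRange(0, n / 2));
        var second = new Thread(() => high = SumRange(n / 2, n));
        first.Start();
        second.Start();
        first.Join();   // wait for both threads before reading the results
        second.Join();
        return low + high;
    }

    // Lightweight style: describe the work as tasks and let the
    // task scheduler decide which threads run them.
    public static long SumWithTasks(int n)
    {
        Task<long> low = Task.Factory.StartNew(() => SumRange(0, n / 2));
        Task<long> high = Task.Factory.StartNew(() => SumRange(n / 2, n));
        return low.Result + high.Result; // Result blocks until the task completes
    }

    public static void Main()
    {
        Console.WriteLine("Threads: {0}", SumWithThreads(1000));
        Console.WriteLine("Tasks:   {0}", SumWithTasks(1000));
    }
}
```

Both methods return the same value. The task-based version leaves thread creation and reuse to the runtime, which is where the reduced overhead described above would be expected to come from.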
CREATING SUCCESSFUL TASK-BASED DESIGNS
Sometimes, you have to optimize an existing solution to take advantage of parallelism. In these
cases, you have to understand an existing sequential design or a parallelized algorithm that offers a
reduced scalability, and then you have to refactor it to achieve a performance improvement without
introducing problems or generating different results. You can take a small part or the whole problem
and create a task-based design, and then you can introduce parallelism. The same technique can be
applied when you have to design a new solution.
You can create successful task-based designs by following these steps:
1. Split each problem into many subproblems and forget about sequential execution.
2. Think about each subproblem as any of the following:
   • Data that can be processed in parallel: decompose data to achieve parallelism.
   • Data flows that require many tasks and that could be processed with some kind of
     complex parallelism: decompose data and tasks to achieve parallelism.
   • Tasks that can run in parallel: decompose tasks to achieve parallelism.
3. Organize your design to express parallelism.
4. Determine the need for tasks to chain the different subproblems. Try to avoid dependencies
as much as possible.
5. Design with concurrency and potential parallelism in mind.
6. Analyze the execution plan for the parallelized problem considering current multicore micro-
processors and future architectures. Prepare your design for higher scalability.
7. Minimize critical sections as much as possible.
8. Implement parallelism using task-based programming whenever possible.
9. Tune and iterate.
The aforementioned steps don’t mean that all the subproblems are going to be parallelized tasks
running in different threads. The design has to consider the possibility of parallelism and then,
when it is time to code, you can decide the best option according to the performance and scal-
ability goals. It is very important to think in parallel and split the work to be done into tasks. This
way, you will be able to parallelize your code as needed. If you have a design prepared for a classic
sequential execution, it is going to take a great effort to parallelize it by using task-based program-
ming techniques.
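Two of the decomposition kinds named in step 2 can be sketched with the `Parallel` class from .NET Framework 4. The workloads (`SquareRoots` and `MinAndMax`) are hypothetical examples chosen for brevity; a real design would pick them from the subproblems identified in step 1.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class DecompositionSketch
{
    // Data decomposition: the same operation applied to independent elements.
    public static double[] SquareRoots(double[] input)
    {
        var output = new double[input.Length];
        Parallel.For(0, input.Length, i =>
        {
            output[i] = Math.Sqrt(input[i]); // each iteration writes only its own slot
        });
        return output;
    }

    // Task decomposition: two different, independent operations
    // that may run at the same time.
    public static int[] MinAndMax(int[] data)
    {
        int min = 0, max = 0;
        Parallel.Invoke(
            () => min = data.Min(),
            () => max = data.Max());
        return new[] { min, max }; // Parallel.Invoke returns after both actions finish
    }

    public static void Main()
    {
        double[] roots = SquareRoots(new[] { 4.0, 9.0, 16.0 });
        Console.WriteLine(string.Join(", ", roots));

        int[] result = MinAndMax(new[] { 3, 1, 2 });
        Console.WriteLine("min = {0}, max = {1}", result[0], result[1]);
    }
}
```

Neither method shares mutable state between the parallel pieces, which keeps dependencies and critical sections out of the design, in line with steps 4 and 7.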
You can combine task-based designs with object-oriented designs. In fact, you
can use object-oriented code to encapsulate parallelism and create parallelized
objects and components.
Designing With Concurrency in Mind
When you design code to take advantage of multiple cores, it is very important to stop thinking
that the code inside a C# application is running alone. C# is prepared for concurrent code, mean-
ing that many pieces of code can run inside the same process simultaneously or with an interleaved
execution. The same class method can be executed in concurrent code. If this method saves a state
in a static variable and then uses this saved state later, many concurrent executions could yield unex-
pected and unpredictable results.
As previously explained, parallel programming for multicore microprocessors works with the
shared-memory model. The data resides in the same shared memory, which could lead to unex-
pected results if the design doesn’t consider concurrency.
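The static-variable problem described above can be illustrated with a small C# sketch. The class and its counters are hypothetical. `Interlocked.Increment` is used here as one possible fix for this specific case; it is an assumption of the example, not the only option.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SharedStateSketch
{
    private static int unsafeCounter;
    private static int safeCounter;

    // Uncoordinated read-modify-write on shared static state:
    // concurrent calls can overwrite each other and lose updates.
    public static int RunUnsafe(int iterations)
    {
        unsafeCounter = 0;
        Parallel.For(0, iterations, i => { unsafeCounter++; });
        return unsafeCounter;
    }

    // Interlocked.Increment performs the read-modify-write atomically,
    // so no update is lost.
    public static int RunSafe(int iterations)
    {
        safeCounter = 0;
        Parallel.For(0, iterations, i => { Interlocked.Increment(ref safeCounter); });
        return safeCounter;
    }

    public static void Main()
    {
        const int iterations = 1000000;
        Console.WriteLine("Expected:          {0}", iterations);
        Console.WriteLine("Unsafe increments: {0}", RunUnsafe(iterations));
        Console.WriteLine("Safe increments:   {0}", RunSafe(iterations));
    }
}
```

On a multicore machine the unsafe total is often lower than expected and can change from run to run, while the safe total always equals the number of iterations. A run in which the unsafe version happens to produce the right number does not prove that the code is correct.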
It is a good practice to prepare each class and method to be able to run concurrently, without side
effects. If you have classes, methods, or components that weren’t designed with concurrency in
mind, you would have to test their designs before using them in parallelized code.
Each subproblem detected in the design process should be capable of running while the other
subproblems are being executed concurrently. If you think that it is necessary to restrict concurrent
code when a certain subproblem runs because it uses legacy classes, methods, or components, it
should be made clear in the design documents. Once you begin working with parallelized code, it is
very easy to incorporate other existing classes, methods, and components that create undesired side
effects because they weren’t designed for concurrent execution.
Understanding the Differences between Interleaved Concurrency,
Concurrency, and Parallelism
Figure 1-15 illustrates the differences between interleaved concurrency and concurrency when there
are two software threads and each one executes four instructions. The interleaved concurrency sce-
nario executes one instruction for each thread, interleaving them, but the concurrency scenario runs
two instructions in parallel, at the same time. The design has to be prepared for both scenarios.
Concurrency requires physically simultaneous processing to happen.
Parallelism entails partitioning work to be done, running processing on those
pieces concurrently, and joining the results. Parallelizing a problem generates
concurrency.
Parallelized code can run in many different concurrency and interleaved concurrency scenarios, even
when it is executed in the same hardware configuration. Thus, one of the great challenges of a paral-
lel design is to make sure that its execution with different possible valid orders and interleaves will
lead to the correct result, a property known as correctness. If you need a specific order, or certain
parts of the code must not run together, it is necessary to make sure that these parts don’t run
concurrently. You cannot assume that they don’t run concurrently just because you ran the code many times
and it produced the expected results. When you design for concurrency and parallelism, you have to
make sure that you consider correctness.
In the next chapter, you will learn more about the differences between concurrency and parallelism
by looking at various code samples.
Parallelizing Tasks
Visual C# 2010 and .NET Framework 4 make it easy to transform task-based designs into parallel-
ized code. However, it is very important to understand that the parallelized code requires specific
testing and tuning procedures in order to achieve the expected goals. You will learn about them
through the rest of the book.
When you parallelize the tasks, the overhead introduced by parallelism can have an important
impact and may require testing different alternatives. As previously explained, modern multicore
microprocessors are extremely complex, and it is necessary to test the results offered by different
parallelizing techniques until you can make your choice. In fact, the same happens with sequential
code, but the difference is that you already know, for example, that a foreach loop can be slower than a for loop.
While parallelizing tasks, a parallelized version of a for loop can offer many different performance
results according to certain parameters that determine the way the parallelized tasks are executed.
Once you experience these scenarios, you will be able to consider them when you have to write the
code for similar problems and analogous task-based designs.
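One such parameter is the maximum degree of parallelism. The following C# sketch runs the same parallelized for loop with different `ParallelOptions.MaxDegreeOfParallelism` values and times each run. The workload (`SumOfSquares`) and the chosen values are hypothetical, and the timings depend entirely on the machine, so no expected numbers are shown.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelTuningSketch
{
    // Sums i * i for i in [0, count) with a parallelized for loop.
    public static long SumOfSquares(int count, int maxDegreeOfParallelism)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };
        long total = 0;

        Parallel.For(0, count, options,
            () => 0L,                                      // subtotal local to each task
            (i, state, local) => local + ((long)i * i),    // accumulate without sharing
            local => Interlocked.Add(ref total, local));   // merge each subtotal once

        return total;
    }

    public static void Main()
    {
        const int count = 50000000;
        foreach (int degree in new[] { 1, 2, Environment.ProcessorCount })
        {
            Stopwatch watch = Stopwatch.StartNew();
            long result = SumOfSquares(count, degree);
            watch.Stop();
            Console.WriteLine("Degree {0}: result = {1}, elapsed = {2} ms",
                degree, result, watch.ElapsedMilliseconds);
        }
    }
}
```

The result is the same for every setting; only the elapsed time changes. Measuring several configurations like this, rather than assuming one is best, is the kind of testing the paragraph above recommends.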
Interleaved concurrency
Time   Thread #   Instruction #
t0     0          0
t1     1          0
t2     0          1
t3     1          1
t4     0          2
t5     1          2
t6     0          3
t7     1          3

Concurrency with a physical simultaneous processing
Time   Thread #   Instruction #   Thread #   Instruction #
t0     1          0               0          0
t1     1          1               0          1
t2     1          2               0          2
t3     1          3               0          3
FIGURE 1-15
Priessnitz says it is difficult to prescribe for these complaints at a
distance; and that except in young people, or where the disease is in
its infancy, a cure is seldom effected. It is however always safe to
adopt the following treatment, which will refresh and strengthen the
patient.
Three rubbing-sheets, at intervals during the day.
One or two foot-baths, but NO sitz-baths without advice.
If the feet swell, continue the treatment, all the same, rub with wet
hands, and bandage the legs, from the ankle to the knee, this will
reduce the swelling.
Spine complaint and general debility.—A lady.
Morning, packing-sheet until warm, followed by plunge-bath one
minute; noon, douche three minutes, return home and then take a
rubbing-sheet and sitz-bath, twenty minutes; afternoon, as in the
morning.
Rubbed the back and nape of the neck with wet hands, twice a day.
Patient staid all the winter; during which time symptoms were
combated as they arose, she gained strength and flesh.
Spinal affection.—A young lady, after submitting to all sorts of
medical treatment for three or four years, came to Gräfenberg. She
was clothed in flannel, suffered greatly from indigestion,
constipation, and languid circulation, feet always cold, walking a
short distance brought on pain in the back.
Second day after her arrival, Priessnitz ordered,—
“Put aside all flannel, go as lightly clad as possible, keep bed-room
window open day and night, and sleep with only a single sheet as a
covering, leave off stockings and run bare-footed on the wet grass
near the house, or on the cold stones of the passage for half an
hour before breakfast in the morning.
“Eat black bread and drink sour milk, lie on the stomach and have
the spine rubbed several times a day with wet hands.”
First four days, patient had cold feet in and after the packing-sheet,
this was then followed by tepid, then cold, and back to tepid-bath,
feet well rubbed, previous to going into packing-sheet, and last thing
at night; by this treatment head-ache was relieved and the feet
became warmer.
In ten days began the douche for one minute; digestion improved;
no longer constipated. Bandages always round the body, and to feet
and legs at night.
Patient was at Gräfenberg nine months, during which time the
treatment was often changed to meet circumstances. One time,
suppressed catamenia was relieved in two days by sixteen rubbing-
sheets a day. At another, patient met with an accident in the leg;
Priessnitz to keep this to the surface, ordered more water to be
drunk. This patient left Gräfenberg in excellent health, though not
entirely cured of the affection of the spine, that being out of its
perpendicular position.
Pain in the Shoulder and Chest.—A lady in the treatment complained
of pain in the shoulder and left breast, and down the side.
Ordered, when in sitz-bath the upper part of the body to be well
rubbed.
Body bandage to be more wrung out than usual, and extra covering
over it.
Pain in the side, Chronic cold in the head.—A German officer aged
50, afflicted as above, and with continued stoppage in the nose, and
frequent head-aches, was told by his medical man that he had no
chance of being cured, was completely relieved at Gräfenberg, in
three or four months.
Packing-sheets and tepid baths twice a day. Rubbing-sheet and sitz-
baths were resorted to for a short time, the cold bath substituted for
the tepid bath, and to this treatment the douche was added.
Weak Chest and Worms.—A child three years old. Wash with tepid
water, 12° once, and after some time twice a day.
Wear body bandage always, and drink water.
Pain in the Chest.—A gentleman had pain in his chest, like the hurt
from a blow, about the size of a crown-piece.
Ordered sixteen rubbing-sheets a day, four at each time.
LXXI.—Constipation.
This complaint is always relieved, and if sufficient time is devoted to
the treatment, finally overcome by Hydropathy; space forbids my
going into details, or numerous cases might be given in proof of this
assertion. The reader’s attention may however be called to the letter
addressed to a newspaper, and signed by upwards of one hundred
patients, giving the case of the son of Prince Leichtenstein, who was
cured in a few days of Constipation, which had endured twenty-eight
days in defiance of all medical aid. To effect a permanent cure, the
treatment must be persevered in for a long time, very often a
twelvemonth.
In a recent case. Rubbing-sheets until feverish heat ceases:
sometimes four or three suffice; at others the number must be
increased to sixteen or twenty, to be immediately followed by a
clyster. Then take a walk, and on returning, a sitz-bath fifteen to
twenty minutes, the abdomen to be well rubbed the whole time.
Body bandage to be worn always and often changed. This treatment
to be resorted to twice a day. Great exercise to be used, and cold
light food to be partaken of.
A delicate lady who had suffered from this complaint for upwards of
twenty years, was relieved in a fortnight, and had no return of it
during her stay at Gräfenberg. Her principal treatment was:—
Packing-sheet and bath twice a day. Rubbing-sheet and sitz-bath at
noon.
A second case, which came under my observation, was that of a
Russian, who for many years had only been relieved by medicine or
enemas. He went to an establishment at Moscow for six months,
where he derived great benefit, though he still used enemas. At
Gräfenberg he abandoned the latter, his bowels were relaxed and
have continued so ever since.
LXXII.—Indigestion.
Foul tongue and pain at the pit of the stomach; a lady having tried
all other remedies, was ordered the following, which answered
admirably.
Three cold sitz-baths a day, for an hour each time, rubbing the
abdomen the whole time, eat nothing but brown bread and drink
sour milk during three days.
Loss of Appetite, Foul Tongue, etc.—Patient had foul tongue, and
loss of appetite.
Morning.—Sweating and tepid bath, stomach to be well rubbed in
the bath. Sitz-bath thirty minutes in the afternoon.
It is very essential to drink abundantly of water, and take great
exercise.
A child five years old. Pale, foul tongue, loss of appetite, thirsty and
awaking with screams. Ablution in the morning, and three tepid sitz-baths
daily four minutes each; chest, back, and abdomen to be
well rubbed all the time; waist bandage day and night. Drink as much
water as possible. Cured in three months.
LXXIII.—Stomach Complaint.
Patient’s stomach deranged, food used to return to his mouth:
difficult of cure. His second visit to Gräfenberg, cured in nine
months. Packing-sheets and rubbing-sheets. Noon, douche, rubbing-
sheet and sitz-bath; afternoon, packing-sheet and bath.
LXXIV.—Throwing Food off the Stomach.
Morning, rubbing-sheet and sitz-bath fifteen minutes. Noon, the
same repeated. Afternoon, sitz-bath.
A gentleman of my acquaintance pursued three or four months’
treatment for this complaint, and left Gräfenberg without being
cured.
LXXV.—Heartburn.
Drink large quantities of water fasting, rub the part with wet hands
and wear a large bandage, changed often, round the waist. If this
does not effect a cure, take a rubbing-sheet or two and a tepid sitz-
bath twice a day. Nausea and sickness are to be treated in the same
manner; if, however, the latter become chronic, then packing-sheets,
tepid baths, and sitz-baths must be resorted to. The diet should be
brown bread and milk only. The milk should be boiled, if it otherwise
disagrees with the patient.
LXXVI.—Sea Sickness.
To avoid sea-sickness or relieve it. The traveller should lay on his
back, and place a large wet towel on his abdomen, changing it when
dry. After a sea voyage take a few rubbing-sheets and sitz-baths.
Wear a waist bandage, and if constipated resort to cold water
clysters.
LXXVII.—Palpitation of the Heart.
Many rubbing-sheets; rub the whole side for a long time and often.
Large bandage. Two sitz-baths a day, fifteen minutes each; rubbing
the afflicted side the whole time. A lady afflicted as above was
relieved in ten minutes by the rubbing-sheets, and dabbling her feet
well in cold water.
LXXVIII.—Want of Sleep.
Before going to bed, take a shallow foot-bath (only to cover the
soles of the foot) for seven to ten minutes, rubbing the feet to above
the ankles all the time, then walk about the room bare-footed until
the feet are quite warm.
A lady, in the treatment, complained of want of sleep.
Two packing-sheets in the afternoon, the first changed as soon as
hot, followed by tepid bath.
Two foot-baths for one hour each, the water only to cover the soles
of the feet. Feet to be well rubbed the whole time. When the servant
is tired of rubbing, patient should walk about the room with bare
feet for a few minutes and then resume the foot-bath.
LXXIX.—Languid Circulation.
I attended many cases of this kind with Mr. Priessnitz, where the
languid circulation arose from using the head more than the body. In
a general way he began with rubbing-sheets in the morning and
afternoon for a few days, and then in the morning packing-sheet
until warm, and tepid bath, cold bath, and back to tepid bath. Noon,
rubbing-sheet and tepid sitz-bath fifteen minutes; afternoon,
packing-sheet and tepid baths as in the morning; or a rubbing-sheet.
Bandaged always.
LXXX.—Ring Worm.
A boy aged seven years had ring worm over the eye and behind his
knees. Cured in six weeks. Two packing-sheets and tepid baths daily.
Bandage to the knees. Child could not endure the douche.
LXXXI.—Hands Frost-bitten or Suffering from a
Boil.
Rub the hands well with tepid water, and particularly the wrist. Put
the elbow into cold water for twenty minutes, three times a day.
Bandage the whole arm from the arm-pits down to the wrist.
LXXXII.—Weak Eyes and Eruption on the Head.
A child two years old had weak eyes, from which there was a
constant discharge and an eruption on the face and head; it was
treated as follows:
Packing-sheet one hour and sometimes longer, followed by tepid
bath. Large bandage from hips to arm-pits night and day. Dabbed
the face often with cold water and bandaged the head at night. In
three weeks eyes quite well and the eruption diminished.
LXXXIII.—Weak Ankles.
If an infant, ablution every morning and bandage the ankles night
and day. If an older person, ablution and foot-baths twenty minutes.
Morning and afternoon, bandage always.
LXXXIV.—Treatment of Infants.
Immediately after birth bathe the infant in warm water 82°, put a
wet bandage on navel, bound on with a dry one, change it morning
and evening only. Continue this until the navel is healed. The
temperature of the bath to be reduced two degrees every fortnight,
until 68°, which is to be used until child can run alone. It may be
washed with cold water at three months of age.
If an Infant is uneasy or restless and cries.—Put on a body bandage;
if this is not sufficient, give it an extra tepid-bath.
The child of an Hungarian commissioner was born weak and sickly,
with great difficulty in breathing. The physicians treated the mother
to improve the milk, when the child refused the breast. From three
days old it was spoon-fed. On the fifth or sixth day, the father put
the child into a packing-sheet until it was warm, when he changed it,
and then applied the tepid-bath.
After four days’ treatment a lump appeared on the chest, which
increased until it became as large as a man’s fist. On the eighth day
it broke, and half a tumbler of matter was discharged. From this
moment the child gradually improved and is now the healthiest of
his children.
Child-teething, Pain in the Head, and Diarrhœa.—Tepid bath for
about five minutes three times a day.
Two head-baths from ten to fifteen minutes each, and one clyster.
A body bandage, and change it often.
LXXXV.—Epilepsy.
This complaint in a general way is not to be cured by Hydropathy;
but Priessnitz thinks persons subject to it should use cold baths, and
cold water as a beverage. I know a young man who was six months
at Gräfenberg, it is now twelve-months since, and as he has not had
an attack, he considers himself cured.
LXXXVI.—Hypochondria and Hysteria.
A disarrangement of the system, and inaction of the abdomen,
cause much uneasiness and discontent. This disease being moral as
well as physical, requires pure air, scenery, society, and a complete
change in the manner of living. What is so calculated to combat this
complaint as Hydropathy?
A patient became hypochondriac, in consequence of chronic
derangement of bowels, struck with rush of blood to the head, face
became crimson, lost speech and consciousness, had convulsions
and spasmodic movement of the arms.
First operation was to put him into a cold bath, and use strong
friction for an hour. He was put into a packing-sheet, in which he
became delirious; he was then rubbed by four men in a tepid bath,
64°. He was still unconscious and yet winced on being pinched;
water thrown on his head caused a slight cry; great heat on the
head. On ceasing the cold affusion, pulse though oppressed began
to be felt—eyes fixed—conjunctiva inflamed.
Friction continued two hours, then ceased for one hour and a half,
and begun again: in an hour spasms ceased, eyes began to move,
without seeing. Patient apparently exhausted, pulse gained its
power, though still often intermittent, upper part of the body hot,
lower extremities could not be warmed all night, consciousness had
not returned in the morning, pulse better, but sleep interrupted,—
patient groaning. All night wet bandage applied to the head. At 6
o’clock next morning, sweating process, perspiration preceded
consciousness, up to which moment patient was insensible to all that
had occurred. After half an hour’s sweating, he was well rubbed in
tepid bath 66°, and put to bed, when he slept. On awaking he
partook of bread and milk.
At 2 o’clock p.m., awoke covered with perspiration, and from that
time until next morning, slept at intervals, pulse regular, talked
calmly and rationally, bowels in a normal state.
In the morning, packing-sheet; and later, sweating process; both
followed by tepid bath 64°—temperature of the body still high. After
good night’s rest, appetite returned, and he was so much better as to renew
the treatment to effect a cure of that which brought him to
Gräfenberg.
LXXXVII.—Fœtid Perspiration of the Feet.
This is relieved by foot-baths, and wearing a bandage on the feet at
night; but it cannot be cured without the sweating process.
LXXXVIII.—Stricture.
Sweating and tepid bath, and cold sitz-baths, are generally resorted
to in this complaint. If cold water is found too severe, tepid is used
for a time; a bandage is always applied to parts affected.
For stoppage of the water, three to six rubbing-sheets; if they fail,
resort to sweating process until water comes, then a tepid bath, or
rubbing-sheet.
Medical men, to effect this object, put the patient first into a warm
bath, and then bleed him until he faints: by these means, the
prostate gland becomes relaxed, and water flows; or water is passed
by the use of catheters, which at Gräfenberg are always dispensed
with.
LXXXIX.—Inflammation of the Kidneys and Urethra.
The treatment must be regulated by circumstances: sometimes
sweating, at other times the packing-sheet, tepid bath, and
bandage.
XC.—Hydrocephalus.
A child one year and a-half old had water on the brain, and a large
protuberance in the middle of the forehead. Ordered, a tepid bath
morning and evening; a rubbing-sheet after an hour’s sleep at noon,
and repeated before going to bed at night. Drank water only at
meals, and then but little. Bandage from arm-pits down to the
knees; was much in the open air. After twelve months, the
protuberance went down, leaving a ridge like a pigeon’s breast down
the centre; shape of head completely changed, and the boy was
perfectly well.
XCI.—Syphilis.
This complaint always succumbs to the treatment; and a cure
effected by it leaves none of those lamentable consequences which
attend the exhibition of drugs. By Hydropathic means, the virus is
completely thrown out of the system through the pores; whilst the
administration of mercury is attended with secondary symptoms,
which are more fatal than the disease itself. If taken in time,
secondary symptoms are also cured at Gräfenberg. It frequently
happens, that patients treated for another complaint, find syphilis
return, though they imagined themselves cured of it years before.
Recent cases of syphilis in otherwise healthy persons, are generally
cured in less than two months; but the cure of secondary symptoms
is a work of time. There are many sufferers from this undermining
malady, who have been at Gräfenberg one, two, and even three
years. In health, they are much improved; but the malady is too
deeply seated to be eradicated. One gentleman, when I was there,
was refused admittance; he died in a few days, when it was found
that mercury had eaten part of his wind-pipe away—a result that
never could have been brought about by water. The following is
another deplorable case, the result of bad treatment:—Patient aged
thirty-five, tall, thin, and bent when walking; supports his head by
pressing his hands on each side of it; part of the cranium destroyed.
The brain covered over by a skin; the parietal bones destroyed, and
thick pus exudes between the skin and bone, and smells horribly.
Inside of the left eye is an ulceration with raised borders, which
allows a portion of the orbital arch to be seen surrounded with pus;
pulse weak and irregular; constant pain. Treated for secondary
symptoms, with mercury in 1841; came to Gräfenberg with three
ulcers the size of a shilling on his forehead, with burning pains.
Packing-sheets and tepid baths morning and evening, with other
intermediate treatment. This case is introduced to show the sort of
cases Mr. Priessnitz will undertake: of course, a cure will require a
considerable time.
XCII.—Chancre.
Case of a very strong young man:—
For five days—sweating (after perspiration broke out) morning, one
hour; afternoon, half an hour; then tepid bath, followed by cold bath
and back to tepid. After five days—from sweating went into plunging
cold bath; in another week, douched from two to five minutes at
eleven o’clock; bandage round the body and on the sores, which
were bathed and had water thrown on them frequently; wore
suspending bandages; eat sparingly; no meat or butter, and took but
little exercise. Perfectly cured in six weeks.
XCIII.—Gonorrhœa and Chancres.
Sweating, followed by bath in the morning; douche at eleven; at
twelve, rubbing-sheet and sitz-bath; afternoon, packing-sheet and
bath; chancres increased to the size of a sixpence then, and in two
days cicatrised. Patient cured in twenty-five days.
Gonorrhœa, &c.—Packing-sheet, tepid bath, and sitz-baths were the
means used. The complaint continuing, Priessnitz supposed it arose
from debility of the parts, and ordered:—
Six sitz-baths of ten minutes, allowing five minutes to elapse
between each, twice a-day; packing-sheets to be changed as soon
as warm, followed by cold bath.
A young man who, immediately on discovering this complaint, took
sitz-baths as above described, injected cold water into the urethra,
bandaged the parts, drank plentifully of cold water, and lived low,
was cured in two days.
Another person was subject to involuntary emissions, by which his
strength was wasting away. In a month after he began the cure, he
found an old gonorrhœa return (which had evidently been driven
into the system and was the cause of his malady); he was now
treated for this and restored to perfect health.
A Russian officer, declared cured of chancre three years before,
found the complaint return, when he was again treated by mercury.
His throat continued to trouble him, his voice was husky, and piles
began to make their appearance. After pursuing the Water-cure for a
short time, as described in a former case, he had a crisis in his foot,
and diarrhœa for a fortnight, when he passed a considerable
quantity of blood. After this, the piles disappeared entirely, and his
voice became sound and clear. It should be observed that he
sweated alternate mornings only; the other mornings, packing-
sheets and bath.
A young man aged 23, attacked with secondary symptoms: sore
throat, etc., was ordered three packing-sheets and cold baths a-day;
rubbing-sheet and sitz-bath.
I knew another strong young man suffering under secondary
symptoms, so that he could hardly walk with the use of a stick; he
went to Gräfenberg, staid there two months, and returned to
England the picture of health.
As there are always at Gräfenberg a large number of individuals
labouring under these complaints, cases of cure might be adduced
ad infinitum: suffice it to say, that hydropathy in their cure is
omnipotent. Buboes and chancres, when taken in their infancy, are
eradicated from the system in a few weeks, sometimes days,
without the debilitating effects attendant upon other deceitful
remedies.
XCIV.—Scrofula and Vaccination.
Priessnitz, when asked what he conceived to be the cause of such
an increase of scrofula as is said to have taken place of late years,
said, he attributed it to vaccination, syphilis and drugs.
When vaccination is performed without producing its desired effect,
the virus remains in the system, and when it proceeds favourably, it
is a question if it is ever thoroughly ejected.
Every practitioner knows the difficulty that exists of finding children
from which to take matter where no taint is in the blood. The child
subjected to vaccination is not only exposed to the sins of his own
forefathers, but also to those of the stranger.
The consequences attendant upon syphilis, and the evil results of
mineral poisons, are such as to lead us to believe that Priessnitz’
opinion is not without foundation. I am doubtful whether scrofula is
ever cured,[7] though whilst at Gräfenberg I saw many obstinate cases
relieved. Children who arrived there perfect cripples, were enabled
to use their limbs like other people. I think I may in great truth say,
that in all cases the enemy received a check, and the general health
of the patient was improved.
A patient states, that previous to inoculation his family were well;
but since that operation they have been scrofulous. He came to
Gräfenberg some years ago from Dartres, when Priessnitz told him
to go home, give up all beverages but water, use cold baths daily,
and he would be well; though incredulous, he followed the advice,
and in two years was perfectly cured.
For scrofula, the whole treatment must be persevered in for a long
time.
XCV.—Piles.
Piles are caused by an accumulation of blood in the vessels which
merge into the large intestines; they either discharge blood, or are
confined to a swelling of the veins, in otherwise healthy subjects.
Hydropathy effects a radical cure of this complaint, whilst medical
remedies are only temporary, and often lead to serious
consequences.
Treatment.—Morning, three rubbing-sheets and sitz-bath, twenty
minutes; noon, the same; afternoon, the same, and an additional
sitz-bath, making four sitz-baths during the day. At night, a rubbing-
sheet but no sitz-bath, as it is too late to walk after it. Body
bandage; much water to be drunk; douche four to eight minutes in
the middle of the day, if possible.
Out of the general treatment, persons troubled with piles may take
sitz-baths and wear a bandage on the part affected.
A patient having piles and sore eyes, was advised neither to take
sitz-baths nor eye-baths. When Priessnitz was asked the reason, he
said, “Because you have too much bad matter in your system, which
I am afraid of attracting to those parts.”
In a common attack of piles, two or three sitz-baths a-day, fifteen
minutes each, and wearing a bandage upon the part at night, will
afford relief.
Persons subject to piles should especially avoid all heating and
stimulating drinks.
XCVI.—Rupture.
I knew of a case of double rupture, in an officer 34 years of age,
which was perfectly cured at Gräfenberg in three years. Another
case of single rupture was cured in nine months, and a recent one
cured in four months.
There can be no doubt of the complete omnipotence of Hydropathy
over this malady; its cure is only a matter of time. It is difficult to lay
down any prescribed treatment, as the chief aim of the practitioner
must be to bring his patient into fine health. All organic action is
contraction; all strength depends upon the power of the different
parts of the body to contract, and nothing will aid the operation so
much as the different appliances here made use of. As a rule, I
observed that when rupture exudes, the sweating process should be
resorted to; when perspiration has broken out, gently rub the part
with the hand until the rupture is gone in again. Bandages are worn
continually.
XCVII.—Chilblains.
Rub the feet or hands affected for a quarter of an hour in tepid
water three times a-day, and bandage the leg from ankle to knee if
in the feet. If in the hand, the arm from wrist to elbow.
XCVIII.—Cold Feet.
Take a shallow foot-bath, cold, one inch deep, before going to bed,
for fifteen minutes; let the feet be well rubbed the whole time, then
walk about the room bare-footed for half an hour, so that re-action
may take place, or they will be colder than before.
XCIX.—Eruption, Scabs, and Sores on the Arms.
A child had tried sulphur bandages and all other conceivable means:—
Morning, noon, and afternoon, packing-sheet and tepid-bath; the
latter after a few days changed to the cold-bath; bandages night and
day; cure effected in a few weeks.
C.—Consumption.
Until the age of fifteen or sixteen Priessnitz conceives this complaint
to be always curable. Very often when parties are supposed to be
consumptive, they are not so. A young lady arrived at Gräfenberg
during my stay there. I thought she had delayed it too long; she
appeared in the last stage of consumption. Priessnitz however took
the case—and, principally with rubbing-sheets, administered three
times a-day, effected an extraordinary cure in two months. I saw this
lady afterwards at Florence, and was quite surprised to see what an
extremely fine woman she had become.
There was also a young lady suffering under the following
symptoms:—great debility, very thin, weak eyes, little or no appetite,
and a short cough, which would awaken her about four o’clock in the
morning, and trouble her the whole day. She was considered by
M.D.’s as consumptive. Priessnitz took a different view of the case,
and as she was cured in two months he was right. Her treatment
was as follows:—
Morning, packing-sheet and plunge bath, the tepid-bath having been
used only for a short time; at ten o’clock, douche; at eleven,
rubbing-sheet and eye-bath; at five, packing-sheet and bath; chest,
waist, and forehead bandaged every night; waist bandaged always.
Consumption of the Nerves.—A gentleman aged 30, came to
Gräfenberg in a most deplorable state, supported on one side by his
wife, on the other by his servant. Second night he was taken
alarmingly ill, with a fever and a stoppage in his bowels. He was too
weak for a packing-sheet or tepid-bath, therefore twelve rubbing-
sheets were administered within three hours; and two head-baths
during the intermediate times. When a change for the better took
place, enemas were applied and relief afforded. The next day patient
was out of doors. I left Gräfenberg about this time, therefore do not
know if he recovered.
Spitting Blood.—A young lady was subjected to spitting blood, pain
at the chest, and general debility. Priessnitz doubted if the lungs
were affected, and tried packing-sheet and tepid-bath, which patient
was found too weak to support. Then rubbing-sheets twice a day;
patient still too weak. Then rubbing-sheet, and tepid sitz-bath ten
minutes. Feverish excitement and loss of appetite came on. Back of
head put into cold water for quarter of an hour; to be repeated
several times a day. Bandage at all times down the middle of the
breast and round the waist. When spitting of blood came on, then
cold foot-baths were resorted to. Patient tried the treatment for a
month, but was not much improved by it.
On leaving, Priessnitz advised her to spend the winter in Italy, to eat
nothing but bread and grapes, and to use cold ablutions.
CI.—Insanity.
This disease, Priessnitz says, is curable when it proceeds from bodily
suffering or disease; but when caused by mental suffering or
misfortune, is generally incurable. I witnessed the treatment of a
case of aberration of mind at Gräfenberg; the patient was put into a
tepid-bath, held there, and rubbed for nine hours and a half; he was
then put to bed, and next morning awoke perfectly composed.
Hydrophobia.—Dr. Short in 1656, published a work, in which he
stated, that with cold water, he had cured the bite of mad dogs and
dropsy. Priessnitz says he never treated the human subject for this
complaint, but that he had cured a dog, by tying him up and
throwing a large number of pails of water over him. At first it caused
him to shiver a great deal, proving the absence of fever to any
extent. When dry the aspersion was repeated; the shivering
diminished at each successive aspersion, until it was entirely allayed.
If, on throwing bread to a dog thus treated, he will eat it, it is a
sign he is cured. Dr. Sully, of Wivelscombe, in a work published some
years ago, states, that he dropped water constantly on the wounded
part, and that it invariably acted as a preventive. My impression is,
that hydropathy is adapted to the cure of this complaint.
CII.—Cholera.
Spasmodic or pestilential cholera first appeared in England in 1831,
and in France in 1832; great difference of opinion exists as to its
cause, and hardly two practitioners agree as to the best way to
effect a cure. Some persons think, as many would get well without
medical aid as with it; and this conjecture is supported by what took
place on its visitation in Dublin. The numbers attacked were so
great, that for the humble class, large tents were erected outside the
city, and the medical men were so harassed by their own connexions
within it, that the poor were left very much to fate. On comparing
notes of the mortality that took place, it was found, that the number
of deaths of those who received medical aid, and those who were
deprived of it, were about equal. Pages might be employed in
enumerating instances related, in which the cholera was cured by
cold water, though administered without reference to any
hydropathic rules. In 1832, Cholera made great ravages in Silesia,
when numbers at Freywaldau and the neighbourhood, fell victims.
Priessnitz’s patients did not escape, though they avoided its fatal
consequences. A friend of mine, who was at Gräfenberg at the time,
assures me that in cholera, Priessnitz never lost a case, though
seventeen of his patients, and many persons in the neighbourhood,
were treated by him. My landlord at Freywaldau, confirmed the last
of these statements, and said that his daughter fell a victim, who, he
felt persuaded, would have recovered, had she been treated with
water instead of drugs.
To ward off this disease, and place the system, if attacked, in the
best condition to resist it, we ask the dispassionate reader, are not
hydropathic rules in accordance with reason and common sense?
There are three different stages in cholera; the first is that of a
common diarrhœa, accompanied with oppression of the chest,
anxiety, and collapse of the face; if neglected, it assumes a more
serious form, the pulse becomes weak, and there is a difficulty of
respiration.
The second stage is ushered in by giddiness, great depression of
pulse and of the vital energies, with spasms, accompanied by
purging and vomitings.
In the third stage, the patient is suddenly laid prostrate, serous fluid,
in large quantities, is discharged from the bowels and stomach, with
cramps and spasms, hardly any pulse, and difficult respiration.
Under ordinary treatment, this frequently terminates life in a few
hours.
To those who have witnessed the wonderful results of the Water-
cure treatment in cholic, diarrhœa, &c., it must be evident, that in
the primary stages of this malady, the treatment resorted to in those
complaints, would be perfectly effectual; and that cholera, in its
worst and most fearful form, is to be successfully combated by no
other than hydropathic means.
If, after visiting a contagious case, Mr. Priessnitz feels at all
uncomfortable, he takes a packing-sheet and tepid-bath.
Asiatic Cholera.—On the first appearance of Cholera symptoms,
which are generally those of languor and chilliness, do not wait for a
development, but apply most vigorously a rubbing-sheet; then dry
the body, and administer a clyster of cold water. In two or three
minutes repeat the rubbing-sheet and clyster, wait five minutes and
repeat the same a third time. Then a cold sitz-bath, letting two
attendants rub the patient with hands dipped in water, particularly
on the abdomen, the whole time; water should be drunk whilst in
the sitz-bath, until patient vomits; when cramps in the stomach and
vomiting have subsided, place a large bandage round the body, and
put him to bed well covered up. After sleeping, apply a tepid-bath
with friction for some time. If not cured, renew the whole operation.
If, after the sitz-bath, cholera appears on the advance, warm a
blanket, and pack the patient as in the sweating process; if he
remains therein several hours, and the symptoms do not decrease,
renew the whole proceedings, and again try to produce perspiration;
when effected, keep it up two or three hours. After this a tepid-bath
62° with friction. The success of the treatment very much depends
upon drinking abundantly of water. The bandages used, should be
doubled or trebled, and changed often. If patient is unable to stand
or sit upright, lay him on a bed, and let several attendants rub him
all over with wet hands.
Extract from a letter from Dr. Gibbs to the editor of the “Water-cure
Journal.”
“You cannot have forgotten the consternation of the profession when
this fearful disease invaded us in 1832. Neither can you be ignorant
that the faculty, generally, are as ill prepared to contend with it now
as they were in former years; but for the information of those who
may not be as well acquainted with such matters as you must be, I
beg to make an extract from the minutes of the proceedings at a
meeting of the Western Medical and Surgical Association, as
reported in the Lancet of September 19, 1846. In the course of a
discussion on the treatment of cholera, Dr. Cahill said, that he
‘positively felt a creeping of the skin at the relation of the enormities
which had been perpetrated by practitioners upon their patients.
When he listened to the recital of practitioners who described the
extravagant cases of mercury and of opium which they administered,
he could not refrain from fancying that he was witnessing the orgies
of so many Indian savages, whilst counting the scalps of their
victims. He thought it a pity that the invention of such a system of
torture should not experience the fate of the inventor of the brazen
bull, and illustrate upon his own person the efficacy of his infernal
ingenuity. He believed that in the majority of persons who died of
Asiatic cholera, death was the consequence of the treatment rather
than of the disease. He had seen above a thousand cases of Asiatic
cholera; and in no instance had he seen any benefit from any mode
of treatment. On the contrary, he had seen persons die of narcotism,
who would have survived if left to the vis medicatrix naturæ. He had
seen others die of absorption of air through the veins when the
saline fluid was ejected; and he knew many who had had the
extraordinary luck to escape both the doctor and the disease, yet
rendered miserable for the remainder of life by the effects of the
immense doses of mercury which had been given to them during the
cholera paroxysms. In fact, it was afflicting to contemplate the
sufferings which the rash and empirical practice of the profession in
the management of this epidemic had created.’ The learned
gentleman likewise said, ‘With respect to cholera, since nothing was
known of its nature, and no treatment had any influence over it, the
best plan was to do as little as possible: give carrara, soda, or pump-
water, with a little laudanum, perhaps in the diarrhœal stage, and
the patient would not be deprived of the chance which nature had
given him.’
“It is to be presumed that the doctor had not seen this disease
treated by the Water-cure, under the operation of which, if I am
correctly informed, and as I can readily believe, results very different
from those, which he witnessed, were obtained. It is stated that
more than twenty cases were successfully treated by Priessnitz, and
between thirty and forty at Breslau, by a clergyman, whose name I
regret that I have forgotten; and it is added that neither practitioner
lost a patient by death. The treatment adopted by each of them was
nearly the same; the principal difference between them being, that
the one employed the sitz-bath, and the other the shallow tepid-
bath.
“If on the appearance of the premonitory symptoms, judicious
treatment be promptly adopted, it seems not improbable that the
disease may be cut short. Those symptoms may be any combination
of the following:—shivering, dizziness, a ringing noise in the ears, a
small quick pulse, accelerated respiration, languor, præcordial
anxiety, a cold white tongue, nausea, vomiting, severe gripings, and
watery diarrhœa. If it be not checked, the disease quickly passes
into the second or algid stage; the circulation becomes feeble, the
blood is drained of its fluid, the muscles are contracted and
cramped, the tongue is colder and whiter, the thirst becomes
burning, the lips livid; the features contracted, the extremities
shrivelled, and the skin cold, clammy, and discoloured.
“Little is known respecting the nature of this disease; but the most
rational opinion seems to be, that it owes its origin to a poison
pervading the blood; deranging the balance between the arterial and
venous circulation, impairing the nervous energy, and impeding all
the functions of the various organs, excepting the secretions from
the stomach and bowels; the preternatural excitement of which
would seem to indicate an effort of nature to expel the disturbing
causes from the system. This opinion obtains additional probability
from the fact, which often has been observed, that the more profuse
is the diarrhœa, the less fatal is the disease.
“Cholera may suddenly appear without manifesting any, or at least
with very slight, premonitory symptoms; especially where the patient
is labouring under any serious affection of the brain, lungs, or air-
passages, when it will sometimes graft itself on the primary disease,
and aggravate all its most various symptoms.
“On the first manifestation of premonitory symptoms, immediate
recourse should be had to repeated friction in a wrung-out sheet, as
in the earlier stages of fever. This will tend to stimulate the nervous
energy, and to maintain or re-establish the balance of circulation
between the arterial and venous systems; will counteract the
disposition to internal congestion by promoting cuticular circulation;
will aid the lungs by freeing the exhalants of the skin, and will
forward the elimination of the virus through the same channels.
“But it will not be sufficient merely to attempt to resist the
encroachments of the disease; the efforts of nature to expel the
cause of it, also claim assistance. To this end cold or tepid water
should be freely drunk to facilitate the vomiting, to dilute and
weaken the action of the poison, to stimulate the kidneys, and to
supply the waste of fluid in the blood. Dr. Rutty, in his synopsis,
says, ‘It [the drinking of water] has also frequently been found
efficacious in stopping violent vomitings and purgings, partly as a
diluent, and partly as a bracer to the fibres; and in violent,
deplorable choleras, cold water is recommended by the ancients,
and at this time is ordered by Spanish physicians with good success,
though Celsus orders it warm.’
“Enemata of pure water, tepid or cold, should likewise be freely
administered; the quantity administered to an infant at one time
should not exceed two ounces; four ounces would be sufficient for a
child six years old; eight ounces for a youth of fifteen, and fifteen or
sixteen ounces for an adult.
“But the principal process is long and entire friction, either in the
shallow tepid-bath or in the sitz-bath. The latter seems to deserve
the preference, inasmuch as it will more directly and powerfully aid
nature in her efforts; its primary action being that of a purgative,
while a less body of water will suffice, than could be made to fulfil
the same intention in a vessel of the shape and size of the half bath;
but, if the sitz-bath be employed, then friction with wet hands
should be applied to the extremities. Cold water may be used in the
sitz-bath, provided that there is nothing in the previous state of the
patient to contra-indicate its use; in which case tepid water must be
employed. Tepid water about 70° Fahr. may likewise be employed in
the shallow bath, as the body of water therein must be greater than
the sitz-bath; but warm applications are never indicated. Vapour-
baths have been tried to recall the circulation to the surface, but
without effect. On this point, Dr. Daun in his ‘Medical Reports on
Cholera,’ says, ‘O’Brien lay on the steam couch for three hours
before he expired, in a heat that I am convinced would have raised a
lifeless body to a temperature nearly, if not equal, to that of a
Welcome to our website – the perfect destination for book lovers and
knowledge seekers. We believe that every book holds a new world,
offering opportunities for learning, discovery, and personal growth.
That’s why we are dedicated to bringing you a diverse collection of
books, ranging from classic literature and specialized publications to
self-development guides and children's books.
More than just a book-buying platform, we strive to be a bridge
connecting you with timeless cultural and intellectual values. With an
elegant, user-friendly interface and a smart search system, you can
quickly find the books that best suit your interests. Additionally,
our special promotions and home delivery services help you save time
and fully enjoy the joy of reading.
Join us on a journey of knowledge exploration, passion nurturing, and
personal growth every day!
ebookbell.com

More Related Content

PDF
Parallel Programming With Microsoft Net Design Patterns For Decomposition And...
PDF
Professional Microsoft Smartphone Programming 1st Edition Baijian Yang
PDF
Essential C 8 0 7th Edition Mark Michaelis Kevin Bost Editor Eric Lippert Editor
PDF
Professional Crossplatform Mobile Development In C Scott Olson
PDF
Professional C and NET 2021st Edition Christian Nagel
PDF
Algorithms And Parallel Computing 1st Edition Fayez Gebali
PDF
OCA Oracle Certified Associate Java SE 8 Programmer I Study Guide.pdf
PDF
Professional Visual Studio 2005 Andrew Parsons
Parallel Programming With Microsoft Net Design Patterns For Decomposition And...
Professional Microsoft Smartphone Programming 1st Edition Baijian Yang
Essential C 8 0 7th Edition Mark Michaelis Kevin Bost Editor Eric Lippert Editor
Professional Crossplatform Mobile Development In C Scott Olson
Professional C and NET 2021st Edition Christian Nagel
Algorithms And Parallel Computing 1st Edition Fayez Gebali
OCA Oracle Certified Associate Java SE 8 Programmer I Study Guide.pdf
Professional Visual Studio 2005 Andrew Parsons

Professional Parallel Programming With C Master Parallel Extensions With Net 4 Gaston Hillar

  • 7. PROFESSIONAL PARALLEL PROGRAMMING WITH C#
    FOREWORD
    INTRODUCTION
    CHAPTER 1 Task-Based Programming
    CHAPTER 2 Imperative Data Parallelism
    CHAPTER 3 Imperative Task Parallelism
    CHAPTER 4 Concurrent Collections
    CHAPTER 5 Coordination Data Structures
    CHAPTER 6 PLINQ: Declarative Data Parallelism
    CHAPTER 7 Visual Studio 2010 Task Debugging Capabilities
    CHAPTER 8 Thread Pools
    CHAPTER 9 Asynchronous Programming Model
    CHAPTER 10 Parallel Testing and Tuning
    CHAPTER 11 Vectorization, SIMD Instructions, and Additional Parallel Libraries
    APPENDIX A .NET 4 Parallelism Class Diagrams
    APPENDIX B Concurrent UML Models
    APPENDIX C Parallel Extensions Extras
    INDEX
  • 9. PROFESSIONAL Parallel Programming with C# MASTER PARALLEL EXTENSIONS WITH .NET 4 Gastón C. Hillar
  • 10. Professional Parallel Programming with C#
    Published by Wiley Publishing, Inc., 10475 Crosspoint Boulevard, Indianapolis, IN 46256, www.wiley.com
    Copyright © 2011 by Wiley Publishing, Inc., Indianapolis, Indiana
    Published by Wiley Publishing, Inc., Indianapolis, Indiana
    Published simultaneously in Canada
    ISBN: 978-0-470-49599-5
    ISBN: 978-1-118-02812-4 (ebk)
    ISBN: 978-1-118-02977-0 (ebk)
    ISBN: 978-1-118-02978-7 (ebk)
    Manufactured in the United States of America
    10 9 8 7 6 5 4 3 2 1
    No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
    Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work, and specifically disclaim all warranties, including, without limitation, warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or website may provide or recommendations it may make. Further, readers should be aware that Internet websites listed in this work may have changed or disappeared between when this work was written and when it is read.
    For general information on our other products and services, please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
    Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
    Library of Congress Control Number: 2010930961
    Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, and Wrox Programmer to Programmer are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. Wiley Publishing, Inc. is not associated with any product or vendor mentioned in this book.
  • 11. To my wonderful wife, Vanesa, who has somehow learned to put up with marathon writing sessions. And to my loving son, Kevin, who always managed to put a smile on my face after a long day.
  • 12. CREDITS
    ACQUISITIONS EDITOR: Paul Reese
    PROJECT EDITORS: Ed Connor, Ginny Munroe
    TECHNICAL EDITOR: Doug Parsons
    PRODUCTION EDITOR: Kathleen Wisor
    COPY EDITOR: Kathryn Duggan
    EDITORIAL DIRECTOR: Robyn B. Siesky
    EDITORIAL MANAGER: Mary Beth Wakefield
    FREELANCER EDITORIAL MANAGER: Rosemarie Graham
    ASSOCIATE DIRECTOR OF MARKETING: David Mayhew
    PRODUCTION MANAGER: Tim Tate
    VICE PRESIDENT AND EXECUTIVE GROUP PUBLISHER: Richard Swadley
    VICE PRESIDENT AND EXECUTIVE PUBLISHER: Barry Pruett
    ASSOCIATE PUBLISHER: Jim Minatel
    PROJECT COORDINATOR, COVER: Katie Crocker
    PROOFREADER: Paul Sagan, Word One New York
    INDEXER: Robert Swanson
    COVER IMAGE: © David Marchal/istockphoto.com
    COVER DESIGNER: Michael Trent
  • 13. ABOUT THE AUTHOR
    GASTÓN C. HILLAR has been working with computers since he was eight. He began programming with the legendary Texas TI-99/4A and Commodore 64 home computers in the early ’80s. He received a bachelor’s degree from UADE University, where he graduated with honors, and a Master of Business Administration from UCEMA University, where he graduated with an outstanding thesis.
    Gastón has been researching parallel programming, multiprocessor, and multicore since 1997. He has 14 years of experience designing and developing diverse types of complex parallelized solutions that take advantage of multicore with C# and .NET Framework. He has been working with Parallel Extensions since the first Community Technology Preview (CTP). He is heavily involved with consulting, training, and development on the .NET platform, focusing on creating efficient software for modern hardware. He regularly speaks on software development at conferences all over the world. In 2009, he was awarded an Intel® Black Belt Software Developer award.
    Gastón has written four books in English, contributed chapters to two other books, and has written more than 40 books in Spanish. He contributes to Dr. Dobb’s at www.drdobbs.com and Dr. Dobb’s Go Parallel programming portal at www.ddj.com/go-parallel/, and is a guest blogger at Intel Software Network (http://software.intel.com). He has worked as a developer, architect, and project manager for many companies in Buenos Aires, Argentina. Now, he is an independent IT consultant working for several American, German, Spanish, and Latin American companies, and a freelance author. He is always looking for new adventures around the world.
    He lives with his wife, Vanesa, and his son, Kevin. When not tinkering with computers, he enjoys developing and playing with wireless virtual reality devices and electronics toys with his father, his son, and his nephew, Nico.
    You can reach him at: [email protected] and follow him on Twitter at: http://twitter.com/gastonhillar. Gastón’s blog is at http://csharpmulticore.blogspot.com.
    ABOUT THE TECHNICAL EDITOR
    DOUG PARSONS is a software architect and the director of Ohio Operations for NJI New Media. His expertise is in web development with a specialization in political websites. Most notably, he has worked on the 2008 John McCain presidential campaign website, and more recently, he has worked on Mitt Romney’s official book tour website. In his down time, he enjoys spending time with his lovely fiancée, Marisa, and their puppies.
  • 15. ACKNOWLEDGMENTS
    PARALLEL PROGRAMMING IS ONE of the most difficult topics to write about. It is usually difficult to isolate subjects without having to reference many closely related topics. However, I had a lot of help during all the necessary stages to produce a high-quality book on a very challenging topic.
    Special thanks go to Paul Reese, Edward Connor, Ginny Munroe, and Rosemarie Graham. They had a lot of patience, and they allowed me to make the necessary changes to the chapters in order to include the most accurate and appropriate information. The book required a lot of work, and they understood that writing an advanced book about parallel programming is a bit different from writing books about other programming topics. They worked very hard to make this book possible. In addition, I must thank Doug Parsons and Kathryn Duggan. You will notice their improvements when you read each sentence. They allowed me to convert a draft into a proof with their valuable feedback.
    I wish to acknowledge Stephen Toub, Principal Program Manager of the Parallel Computing Platform team at Microsoft, who provided me with invaluable feedback for all the chapters. I was able to improve the examples and the contents by incorporating Stephen’s insightful comments. This book would have been very difficult to finish without Stephen’s help. His team’s blog is at http://blogs.msdn.com/b/pfxteam/. The blog is an essential resource for keeping up-to-date with Parallel Extensions improvements and usage.
    I must also thank Daniel Moth, member of the Microsoft Technical Computing group. Daniel helped me to improve the chapter that covers the new and exciting debugging features included in Visual Studio 2010. His feedback allowed me to incorporate a fascinating chapter into this book.
    Special thanks go to Aaron Tersteeg and Kathy Farrel, managers of the excellent Parallel Programming community at Intel Software Network. I had the opportunity to enrich my knowledge in parallel computing topics through this great community. I wouldn’t have been able to write this book without listening to and watching the Parallel Programming Talk shows (www.intel.com/software/parallelprogrammingtalk) that kept me up-to-date with the parallel computing trends.
    Some of the information in this book is the result of intensive discussions I had at the Intel Black Belt Annual Meetups. I would like to acknowledge James Reinders, Dr. Clay Breshears, Jim Dempsey, and Doug Holland for sharing their wisdom. Doug also shares with me a passion for .NET, and I learned a great deal about his first experiences with Parallel Extensions via his blog: An Architect’s Perspective (http://blogs.msdn.com/b/dohollan/).
    I must also thank Jon Erickson, editor of the Dr. Dobb’s website (www.drdobbs.com). Jon gave me the opportunity to contribute to both Dr. Dobb’s and Dr. Dobb’s Go Parallel (www.ddj.com/go-parallel/) in order to share my experience with other developers and architects. This book incorporates the great feedback received from my contributions.
    I wish to acknowledge Hector A. Algarra, who always helped me to improve my writing skills.
  • 16. Special thanks go to my wife, Vanesa S. Olsen, my son, Kevin, my nephew, Nicolas, my father, Jose Carlos, my sister, Silvina, and my mother, Susana. They were my greatest supporters during the production of this book.
    And finally, thanks to all of you for selecting this book. I hope the parallel programming knowledge that you gain from it will help you develop powerful, high-performance applications, and responsive user interfaces.
  • 17. CONTENTS
    FOREWORD
    INTRODUCTION
    CHAPTER 1: TASK-BASED PROGRAMMING
    Working with Shared-Memory Multicore
    Differences Between Shared-Memory Multicore and Distributed-Memory Systems
    Parallel Programming and Multicore Programming
    Understanding Hardware Threads and Software Threads
    Understanding Amdahl’s Law
    Considering Gustafson’s Law
    Working with Lightweight Concurrency
    Creating Successful Task-Based Designs
    Designing With Concurrency in Mind
    Understanding the Differences between Interleaved Concurrency, Concurrency, and Parallelism
    Parallelizing Tasks
    Minimizing Critical Sections
    Understanding Rules for Parallel Programming for Multicore
    Preparing for NUMA and Higher Scalability
    Deciding the Convenience of Going Parallel
    Summary
    CHAPTER 2: IMPERATIVE DATA PARALLELISM
    Launching Parallel Tasks
    System.Threading.Tasks.Parallel Class
    Parallel.Invoke
    No Specific Execution Order
    Advantages and Trade-Offs
    Interleaved Concurrency and Concurrency
    Transforming Sequential Code to Parallel Code
    Detecting Parallelizable Hotspots
    Measuring Speedups Achieved by Parallel Execution
    Understanding the Concurrent Execution
    Parallelizing Loops
  • 18. Parallel.For
    Refactoring an Existing Sequential Loop
    Measuring Scalability
    Working with Embarrassingly Parallel Problems
    Parallel.ForEach
    Working with Partitions in a Parallel Loop
    Optimizing the Partitions According to the Number of Cores
    Working with IEnumerable Sources of Data
    Exiting from Parallel Loops
    Understanding ParallelLoopState
    Analyzing the Results of a Parallel Loop Execution
    Catching Exceptions that Occur Inside Parallel Loops
    Specifying the Desired Degree of Parallelism
    ParallelOptions
    Counting Hardware Threads
    Logical Cores Aren’t Physical Cores
    Using Gantt Charts to Detect Critical Sections
    Summary
    CHAPTER 3: IMPERATIVE TASK PARALLELISM
    Creating and Managing Tasks
    System.Threading.Tasks.Task
    Understanding a Task’s Status and Lifecycle
    TaskStatus: Initial States
    TaskStatus: Final States
    Using Tasks to Parallelize Code
    Starting Tasks
    Visualizing Tasks Using Parallel Tasks and Parallel Stacks
    Waiting for Tasks to Finish
    Forgetting About Complex Threads
    Cancelling Tasks Using Tokens
    CancellationTokenSource
    CancellationToken
    TaskFactory
    Handling Exceptions Thrown by Tasks
    Returning Values from Tasks
    TaskCreationOptions
    Chaining Multiple Tasks Using Continuations
    Mixing Parallel and Sequential Code with Continuations
    Working with Complex Continuations
    TaskContinuationOptions
  • 19. Programming Complex Parallel Algorithms with Critical Sections Using Tasks
    Preparing the Code for Concurrency and Parallelism
    Summary
    CHAPTER 4: CONCURRENT COLLECTIONS
    Understanding the Features Offered by Concurrent Collections
    System.Collections.Concurrent
    ConcurrentQueue
    Understanding a Parallel Producer-Consumer Pattern
    Working with Multiple Producers and Consumers
    Designing Pipelines by Using Concurrent Collections
    ConcurrentStack
    Transforming Arrays and Unsafe Collections into Concurrent Collections
    ConcurrentBag
    IProducerConsumerCollection
    BlockingCollection
    Cancelling Operations on a BlockingCollection
    Implementing a Filtering Pipeline with Many BlockingCollection Instances
    ConcurrentDictionary
    Summary
    CHAPTER 5: COORDINATION DATA STRUCTURES
    Using Cars and Lanes to Understand the Concurrency Nightmares
    Undesired Side Effects
    Race Conditions
    Deadlocks
    A Lock-Free Algorithm with Atomic Operations
    A Lock-Free Algorithm with Local Storage
    Understanding New Synchronization Mechanisms
    Working with Synchronization Primitives
    Synchronizing Concurrent Tasks with Barriers
    Barrier and ContinueWhenAll
    Catching Exceptions in all Participating Tasks
    Working with Timeouts
    Working with a Dynamic Number of Participants
    Working with Mutual-Exclusion Locks
    Working with Monitor
  • 20. Working with Timeouts for Locks
    Refactoring Code to Avoid Locks
    Using Spin Locks as Mutual-Exclusion Lock Primitives
    Working with Timeouts
    Working with Spin-Based Waiting
    Spinning and Yielding
    Using the Volatile Modifier
    Working with Lightweight Manual Reset Events
    Working with ManualResetEventSlim to Spin and Wait
    Working with Timeouts and Cancellations
    Working with ManualResetEvent
    Limiting Concurrency to Access a Resource
    Working with SemaphoreSlim
    Working with Timeouts and Cancellations
    Working with Semaphore
    Simplifying Dynamic Fork and Join Scenarios with CountdownEvent
    Working with Atomic Operations
    Summary
    CHAPTER 6: PLINQ: DECLARATIVE DATA PARALLELISM
    Transforming LINQ into PLINQ
    ParallelEnumerable and Its AsParallel Method
    AsOrdered and the orderby Clause
    Specifying the Execution Mode
    Understanding Partitioning in PLINQ
    Performing Reduction Operations with PLINQ
    Creating Custom PLINQ Aggregate Functions
    Concurrent PLINQ Tasks
    Cancelling PLINQ
    Specifying the Desired Degree of Parallelism
    WithDegreeOfParallelism
    Measuring Scalability
    Working with ForAll
    Differences Between foreach and ForAll
    Measuring Scalability
    Configuring How Results Are Returned by Using WithMergeOptions
    Handling Exceptions Thrown by PLINQ
    Using PLINQ to Execute MapReduce Algorithms
    Designing Serial Stages Using PLINQ
    Locating Processing Bottlenecks
    Summary
  • 21. CHAPTER 7: VISUAL STUDIO 2010 TASK DEBUGGING CAPABILITIES
    Taking Advantage of Multi-Monitor Support
    Understanding the Parallel Tasks Debugger Window
    Viewing the Parallel Stacks Diagram
    Following the Concurrent Code
    Debugging Anonymous Methods
    Viewing Methods
    Viewing Threads in the Source Code
    Detecting Deadlocks
    Summary
    CHAPTER 8: THREAD POOLS
    Going Downstairs from the Tasks Floor
    Understanding the New CLR 4 Thread Pool Engine
    Understanding Global Queues
    Waiting for Worker Threads to Finish Their Work
    Tracking a Dynamic Number of Worker Threads
    Using Tasks Instead of Threads to Queue Jobs
    Understanding the Relationship Between Tasks and the Thread Pool
    Understanding Local Queues and the Work-Stealing Algorithm
    Specifying a Custom Task Scheduler
    Summary
    CHAPTER 9: ASYNCHRONOUS PROGRAMMING MODEL
    Mixing Asynchronous Programming with Tasks
    Working with TaskFactory.FromAsync
    Programming Continuations After Asynchronous Methods End
    Combining Results from Multiple Concurrent Asynchronous Operations
    Performing Asynchronous WPF UI Updates
    Performing Asynchronous Windows Forms UI Updates
    Creating Tasks that Perform EAP Operations
    Working with TaskCompletionSource
    Summary
    CHAPTER 10: PARALLEL TESTING AND TUNING
    Preparing Parallel Tests
    Working with Performance Profiling Features
    Measuring Concurrency
  • 22. Solutions to Common Patterns
    Serialized Execution
    Lock Contention
    Lock Convoys
    Oversubscription
    Undersubscription
    Partitioning Problems
    Workstation Garbage-Collection Overhead
    Working with the Server Garbage Collector
    I/O Bottlenecks
    Main Thread Overload
    Understanding False Sharing
    Summary
    CHAPTER 11: VECTORIZATION, SIMD INSTRUCTIONS, AND ADDITIONAL PARALLEL LIBRARIES
    Understanding SIMD and Vectorization
    From MMX to SSE4.x and AVX
    Using the Intel Math Kernel Library
    Working with Multicore-Ready, Highly Optimized Software Functions
    Mixing Task-Based Programming with External Optimized Libraries
    Generating Pseudo-Random Numbers in Parallel
    Using Intel Integrated Performance Primitives
    Summary
    APPENDIX A: .NET 4 PARALLELISM CLASS DIAGRAMS
    Task Parallel Library
    System.Threading.Tasks.Parallel Classes and Structures
    Task Classes, Enumerations, and Exceptions
    Data Structures for Coordination in Parallel Programming
    Concurrent Collection Classes: System.Collections.Concurrent
    Lightweight Synchronization Primitives
    Lazy Initialization Classes
    PLINQ
    Threading
    Thread and ThreadPool Classes and Their Exceptions
    Signaling Classes
    Threading Structures, Delegates, and Enumerations
    BackgroundWorker Component
  • 23. APPENDIX B: CONCURRENT UML MODELS
    Structure Diagrams
    Class Diagram
    Component Diagram
    Deployment Diagram
    Package Diagram
    Behavior Diagrams
    Activity Diagram
    Use Case Diagram
    Interaction Diagrams
    Interaction Overview Diagram
    Sequence Diagram
    APPENDIX C: PARALLEL EXTENSIONS EXTRAS
    Inspecting Parallel Extensions Extras
    Coordination Data Structures
    Extensions
    Parallel Algorithms
    Partitioners
    Task Schedulers
    INDEX
  • 25. FOREWORD
    aParllel prgoamrmnig s i ahdr.
    Hmm, let me try that again. Parallel programming is hard.
    While truly a silly example, my first sentence exemplifies some of the difficulties we, as developers, face while writing multithreaded code. As I write this foreword, my hands typing on my laptop are effectively two distinct, physical processes, and if you further consider each of my fingers as an individual entity, the count would instead be 10. I’m generally acknowledged to be a fast typist, and in order to type at over 100 words per minute, my brain manages to coordinate all of my fingers, flowing them concurrently toward their next targets, yet still (most of the time) ensuring that their output is correctly serialized according to the spelling of the words my mind is willing my hands to render. I deliberately suspended that coordination to deliver that first sentence, such that my hands were no longer synchronizing correctly. The result is a barely readable representation of my thoughts. Luckily, it was easily debugged.
    Parallel programming is indeed hard, or at least it has been historically. With the tools that have been available, only a small percentage of software developers have been proficient enough in the art to successfully develop and debug multithreaded applications. And yet, since the advent of modern computers, developers who need to write responsive user interfaces, build scalable services, or take advantage of multiple processing cores for performance have been forced to deal with concurrency, forced to develop software at the level of threads, mutexes, and semaphores. The difficulties here abound: oversubscription, race conditions, deadlocks, live locks, two-step dances, priority inversions, lock convoys, contention, and false sharing, just to name a few.
    With all of these complications and with the recent industry shift toward multicore and manycore, parallel programming has received a lot of attention from companies that build development platforms, with Microsoft chief among them. Several years ago, the Parallel Computing Platform team at Microsoft emerged with a clear vision and purpose: to make building parallelized software easier. Developers should be able to easily express the parallelism that exists in their applications and allow the underlying framework, run-time, and operating system to implement that parallelism for them, mapping the expressed parallelism down to the hardware for correct and efficient execution.
    The first wave of supported components from the Parallel Computing Platform team was released in April 2010 as part of Visual Studio 2010; whether you’re using native or managed code, this release provides foundational work to simplify the development of parallel applications. For developers using managed code, this includes the Task Parallel Library, Parallel LINQ, the new Parallel Stacks and Parallel Tasks debugging windows, a Concurrency Visualizer that yields deep insights into the execution of your multithreaded applications, and more.
    Even with all of this new support, parallel programming still requires in-depth knowledge. In an age in which communication abounds in the form of 140-character quips, I personally find there’s no better medium for conveying that in-depth knowledge than in a quality book. Luckily, you’re reading one right now. Here, Gastón Hillar delivers a comprehensive book that covers all aspects of developing parallel applications with Visual Studio 2010 and the .NET Framework 4.
  • 26. From task-based programming to data parallelism to managing shared state to debugging parallel programs, and from the Task Parallel Library to Parallel LINQ to the ThreadPool to new coordination and synchronization primitives, Gastón provides a welcome and in-depth tour through the vast support for parallel programming that now exists in .NET Framework 4 and Visual Studio 2010. This book contains information that can provide you with solid foundational knowledge you’ll want when developing parallel applications.
    Congratulations on taking your first steps into this brave new manycore world.
    Stephen Toub
    Principal Program Manager
    Parallel Computing Platform
    Microsoft Corporation
    Redmond, WA
    September 2010
  • 27. INTRODUCTION In 2007, Microsoft released the first Community Technology Preview (CTP) of Parallel Extensions for the .NET Framework. The old .NET Framework multithreading programming model was too complex and heavyweight for the forthcoming multicore and manycore CPUs. I had been research- ing parallel programming, multiprocessor, and multicore since 1997, so I couldn’t help installing the first CTP and trying it. It was obvious that it was going to be an exciting new way of expressing parallelism in future C# versions. Visual Studio 2010 ships with version 4 of the .NET Framework, the first release to include Parallel Extensions. C# 4 and .NET Framework 4 allow you to shift to a modern task-based programming model to express parallelism. It is easier to write code that takes advantage of multicore micropro- cessors. Now, you can write code that scales as the number of available cores increases, without having to work with complex managed threads. You are able to write code that runs tasks, and the Common Language Runtime (CLR) will inject the necessary threads for you. It is easy to run data parallelism algorithms taking advantage of multicore. At the time of this writing, multicore microprocessors are everywhere. Servers, desktop computers, laptops and notebooks, netbooks, mobile Internet devices (MIDs), tablets, and even smartphones use multicore microprocessors. The average number of cores in each microprocessor is going to increase in the forthcoming years. Are you going to lose the opportunity to transform this multicore power into application performance? Parallel programming must become part of your skill set to effectively develop applications for modern hardware in C#. I spent more than three years working with the diverse versions of Parallel Extensions until Visual Studio 2010 was officially released. I enjoyed developing parallelized applications with C#, and I did my best to include explanations for the most common scenarios in this book. 
Visual Studio 2010 provides an IDE prepared for a parallel developer, and C# is an excellent fit for the new task-based programming model.
WHO THIS BOOK IS FOR
This book was written to help experienced C# developers transform the multicore power found in modern microprocessors into application performance by using the Parallel Extensions introduced in .NET Framework 4. Whether you are just starting the transition from the previous multithreading model or have worked with concurrent and parallel code for a while and need to gain a deeper understanding, this book provides information on the most common parallel programming skills and concepts you need.
This book offers a wide-ranging presentation of parallel programming concepts, but parallel programming possibilities with C# and .NET Framework 4 are so large and comprehensive that no single book can cover them all. The goal of this book is to provide a working knowledge of key technologies that are important to C# developers who want to take advantage of multicore and
  • 28. manycore architectures. It provides adequate knowledge for an experienced C# developer to work in many high-level parallelism areas. The book covers the new task-based programming model. Some developers who are interested in distributed parallelism and low-level concurrent programming topics may choose to add to their knowledge by supplementing this book with other books dedicated entirely to these technology areas.
This book provides background information that is very important to avoid common parallel programming pitfalls; therefore, it is best to read it without skipping chapters. Moreover, you should finish reading a chapter before considering the code shown in the middle of that chapter as a best practice. As each chapter introduces new features for Parallel Extensions, the examples are enhanced with simpler and more efficient coding techniques.
This book assumes that you are an experienced C# and .NET Framework 4 developer and focuses on parallel programming topics. If you don’t have experience with advanced object-oriented programming, lists, arrays, closures, delegates, lambda expressions, LINQ, typecasting, and basic debugging techniques, you may need additional training to fully understand the examples shown.
WHAT THIS BOOK COVERS
This book covers the following key technologies and concepts:
- Modern multicore and manycore shared-memory architectures
- High-level, task-based programming with Task Parallel Library (TPL), C#, and .NET Framework 4
- Parallel Language Integrated Query (PLINQ)
- Most common coordination data structures and synchronization primitives for task-based programming
- Visual Studio 2010 debugging and profiling capabilities related to parallel programming
- Additional libraries, tools, and extras that help you master multicore programming in real-life applications
This book does not cover the old multithreaded programming model or distributed parallelism. 
HOW THIS BOOK IS STRUCTURED
It is critical to master certain topics first. Unless you have previous experience with the new task-based programming model introduced in .NET Framework 4, you should read the book chapter by chapter. Each chapter was written with the assumption that you have read the previous chapter. However, if you have previously worked with TPL and PLINQ, you will be able to read and understand the content included in the chapters related to parallel debugging and tuning.
  • 29. The book is divided into the following 11 chapters and three appendixes:
Chapter 1, “Task-Based Programming”: Explore the new task-based programming model that allows you to introduce parallelism in .NET Framework 4 applications. Parallelism is essential to exploit modern multicore and manycore architectures. This chapter describes the new lightweight concurrency models and important concepts related to concurrency and parallelism. It is important to read this chapter, because it includes the necessary background information in order to prepare your mind for the next 10 chapters and three appendixes.
Chapter 2, “Imperative Data Parallelism”: Start learning the new programming models introduced in C# 4 and .NET Framework 4 and apply them with pure data parallel problems. This chapter is about some of the new classes, structures, and enumerations that allow you to deal with data parallelism scenarios. Run the examples to understand the performance improvements.
Chapter 3, “Imperative Task Parallelism”: Start working with the new Task instances to solve imperative task parallelism problems and complex algorithms with simple code. This chapter is about the new classes, structures, and enumerations that allow you to deal with imperative task parallelism scenarios. Implement existing algorithms in parallel using basic and complex features offered by the new task-based programming model. Create parallel code using tasks instead of threads.
Chapter 4, “Concurrent Collections”: Task-based programming, imperative data, and task parallelism require arrays, lists, and collections capable of supporting updates concurrently. Work with the new concurrent collections to simplify the code and to achieve the best performance. This chapter is about the new classes and the new interface that allows you to work with shared concurrent collections from multiple tasks. 
It explains how to create parallel code that adds, removes, and updates values of different types in lists with diverse ordering schemes and structures.
Chapter 5, “Coordination Data Structures”: Synchronize the work performed by diverse concurrent tasks. This chapter covers some classic synchronization primitives and the new lightweight coordination data structures introduced in .NET Framework 4. It is important to learn the different alternatives, so that you can choose the most appropriate one for each concurrency scenario that requires communication and/or synchronization between multiple tasks. This is the most complex chapter in the book and one of the most important ones. Be sure to read it before writing complex parallelized algorithms.
Chapter 6, “PLINQ: Declarative Data Parallelism”: Work with declarative data parallelism using Parallel Language Integrated Query (PLINQ) and its aggregate functions. You can use PLINQ to simplify the code that runs a mix of task and data decomposition. You can also execute the classic parallel Map Reduce algorithm. This chapter combines many of the topics introduced in previous chapters and explains how to transform a LINQ query into a PLINQ query. In addition, the chapter teaches different techniques to tune PLINQ’s parallel execution according to diverse scenarios.
Chapter 7, “Visual Studio 2010 Task Debugging Capabilities”: Take full advantage of the new task debugging features introduced in Visual Studio 2010. This chapter describes how the new windows display important information about the tasks and their relationships with the source code and the threads assigned to support their execution. Use these new
  • 30. windows to detect and solve potential bugs when working with parallelized code in .NET Framework 4.
Chapter 8, “Thread Pools”: Understand the differences between using tasks and directly requesting work items to run in threads in the thread pool. If you work with a thread pool, you can take advantage of the new improvements and move your code to a task-based programming model. This chapter is about the changes in the CLR thread pool engine introduced in .NET Framework 4 and provides an example of a customized task scheduler.
Chapter 9, “Asynchronous Programming Model”: Leverage the advantages of mixing the existing asynchronous programming models with tasks. This chapter provides real-life examples that take advantage of the simplicity of tasks and continuations to perform concurrent asynchronous jobs related to the existing asynchronous programming models. In addition, the chapter teaches one of the most complex topics related to concurrent programming: the process of updating the User Interface (UI) from diverse tasks and threads. The chapter explains patterns to update the UI in both Windows Forms and Windows Presentation Foundation (WPF) applications.
Chapter 10, “Parallel Testing and Tuning”: Leverage the new concurrency profiling features introduced in Visual Studio 2010 Premium and Ultimate editions. It is very important to learn the common, easy-to-detect problems related to parallelized code with .NET Framework 4. This chapter explains the different techniques used to create and run parallel tests and benchmarks. It also teaches you to refactor an existing application according to the results of each profiling session.
Chapter 11, “Vectorization, SIMD Instructions, and Additional Parallel Libraries”: Take advantage of other possibilities offered by modern hardware related to parallelism. .NET Framework 4 does not offer direct support to SIMD or vectorization. 
However, most modern microprocessors provide these powerful additional instructions. Thus, you can use libraries optimized to take advantage of the performance improvements provided by these instructions. This chapter explains how to integrate Intel Math Kernel Library into task-based programming code using C#. In addition, it explains how to optimize critical sections using Intel Integrated Performance Primitives.
Appendix A, “.NET 4 Parallelism Class Diagrams”: This appendix includes diagrams for the classes, interfaces, structures, delegates, enumerations, and exceptions that support parallelism with the new lightweight concurrency model and the underlying threading model. There are also references to the chapters that explain the contents of these diagrams in more detail.
Appendix B, “Concurrent UML Models”: This appendix gives you some examples of how you can use UML models to represent designs and code prepared for concurrency. You can extend the classic models by adding a few simple and standardized visual elements.
Appendix C, “Parallel Extensions Extras”: Parallel Extensions Extras is a complementary project that isn’t part of the .NET Framework 4 classes, but was developed by Microsoft as part of the parallel programming samples for .NET Framework 4. This appendix includes diagrams and brief descriptions for the classes and structures that constitute the Parallel Extensions Extras.
  • 31. WHAT YOU NEED TO USE THIS BOOK
To get the most out of this book, you’ll need Visual Studio 2010 Premium or Ultimate Edition, which includes .NET Framework 4 and the concurrency profiling features. You may use Visual Studio 2010 Standard Edition instead, but the concurrency profiling features aren’t available in this edition. You shouldn’t use Visual C# 2010 Express Edition, because it doesn’t provide the necessary debugging windows to work with task-based programming.
In addition, you’ll need at least two physical cores in your developer computer to understand the examples shown in the book. However, if you want to test scalability, at least three physical cores is a better option.
Windows 7 and Windows Server 2008 R2 introduced significant enhancements in their schedulers to improve scalability in multicore and manycore systems. The book is based on applications running on these Windows versions. If you work with previous Windows versions, the results might differ.
CONVENTIONS
To help you get the most from the text and keep track of what’s happening, we’ve used a number of conventions throughout the book.
Notes, tips, hints, tricks, and asides to the current discussion are offset and placed in italics like this.
As for styles in the text:
- We highlight new terms and important words when we introduce them.
- We show keyboard strokes like this: Ctrl+A.
- We show file names, URLs, and code within the text like so: persistence.properties.
- We present code in two different ways: We use a monofont type with no highlighting for most code examples. We use bold to emphasize code that’s particularly important in the present context.
SOURCE CODE
As you work through the examples in this book, you may choose either to type in all the code manually or to use the source code files that accompany the book. All of the source code used in this book is available for download at www.wrox.com. Once at the site, simply locate the book’s title
  • 32. (either by using the Search box or by using one of the title lists) and click the Download Code link on the book’s detail page to obtain all the source code for the book.
Because many books have similar titles, you may find it easiest to search by ISBN. This book’s ISBN is 978-0-470-49599-5.
Once you download the code, just decompress it with your favorite compression tool. Alternately, you can go to the main Wrox code download page at www.wrox.com/dynamic/books/download.aspx to see the code available for this book and all other Wrox books.
ERRATA
We make every effort to ensure that there are no errors in the text or in the code. However, no one is perfect, and mistakes do occur. If you find an error in one of our books, like a spelling mistake or faulty piece of code, we would be very grateful for your feedback. By sending in errata, you may save another reader hours of frustration, and at the same time, you will be helping us provide even higher-quality information.
To find the errata page for this book, go to www.wrox.com and locate the title using the Search box or one of the title lists. Then, on the book details page, click the Book Errata link. On this page, you can view all errata that have been submitted for this book and posted by Wrox editors. A complete book list, including links to each book’s errata, is also available at www.wrox.com/misc-pages/booklist.shtml.
If you don’t spot “your” error on the Book Errata page, go to www.wrox.com/contact/techsupport.shtml and complete the form there to send us the error you have found. We’ll check the information and, if appropriate, post a message to the book’s errata page and fix the problem in subsequent editions of the book.
P2P.WROX.COM
For author and peer discussion, join the P2P forums at p2p.wrox.com. The forums are a web-based system for you to post messages relating to Wrox books and technologies and interact with other readers and technology users. 
The forums offer a subscription feature to email you topics of interest of your choosing when new posts are made to the forums. Wrox authors, editors, other industry experts, and your fellow readers are present on these forums.
At http://p2p.wrox.com, you will find a number of different forums that will help you not only as you read this book, but also as you develop your own applications. To join the forums, just follow these steps:
  • 33.
1. Go to p2p.wrox.com and click the Register link.
2. Read the terms of use and click Agree.
3. Complete the required information to join as well as any optional information you wish to provide and click Submit.
4. You will receive an email with information describing how to verify your account and complete the joining process.
You can read messages in the forums without joining P2P, but in order to post your own messages, you must join.
Once you join, you can post new messages and respond to messages other users post. You can read messages at any time on the web. If you would like to have new messages from a particular forum emailed to you, click the Subscribe To This Forum icon by the forum name in the forum listing.
For more information about how to use the Wrox P2P, be sure to read the P2P FAQs for answers to questions about how the forum software works as well as many common questions specific to P2P and Wrox books. To read the FAQs, click the FAQ link on any P2P page.
  • 35. 1 Task-Based Programming
WHAT’S IN THIS CHAPTER?
- Working with shared-memory multicore
- Understanding the differences between shared-memory multicore and distributed-memory systems
- Working with parallel programming and multicore programming in shared-memory architectures
- Understanding hardware threads and software threads
- Understanding Amdahl’s Law
- Considering Gustafson’s Law
- Working with lightweight concurrency models
- Creating successful task-based designs
- Understanding the differences between interleaved concurrency, concurrency, and parallelism
- Parallelizing tasks and minimizing critical sections
- Understanding rules for parallel programming for multicore architectures
- Preparing for NUMA architectures
This chapter introduces the new task-based programming model that allows you to introduce parallelism in applications. Parallelism is essential to exploit modern shared-memory multicore architectures. The chapter describes the new lightweight concurrency models and important concepts related to concurrency and parallelism. It includes the necessary background information in order to prepare your mind for the next 10 chapters.
  • 36. WORKING WITH SHARED-MEMORY MULTICORE
In 2005, Herb Sutter published an article in Dr. Dobb’s Journal titled “The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software” (www.gotw.ca/publications/concurrency-ddj.htm). He talked about the need to start developing software with concurrency in mind in order to fully exploit the continuing exponential gains in microprocessor throughput. Microprocessor manufacturers are adding processing cores instead of increasing their clock frequency. Software developers can no longer rely on the free-lunch performance gains that increases in clock frequency used to provide.
Most machines today have at least a dual-core microprocessor. However, quad-core and octal-core microprocessors, with four and eight cores, respectively, are quite popular on servers, advanced workstations, and even on high-end mobile computers. More cores in a single microprocessor are right around the corner. Modern microprocessors offer new multicore architectures. Thus, it is very important to prepare the software designs and the code to exploit these architectures. The different kinds of applications generated with Visual C# 2010 and .NET Framework 4 run on one or many central processing units (CPUs), the main microprocessors. Each of these microprocessors can have a different number of cores, capable of executing instructions.
You can think of a multicore microprocessor as many interconnected microprocessors in a single package. All the cores have access to the main memory, as illustrated in Figure 1-1. Thus, this architecture is known as shared-memory multicore. Sharing memory in this way can easily lead to a performance bottleneck.
FIGURE 1-1: Cores #0 through #n, all connected to a single shared main memory.
  • 37. Multicore microprocessors have many different complex micro-architectures, designed to offer more parallel-execution capabilities, improve overall throughput, and reduce potential bottlenecks. At the same time, multicore microprocessors try to shrink power consumption and generate less heat. Therefore, many modern microprocessors can increase or reduce the frequency for each core according to their workload, and they can even sleep cores when they are not in use. Windows 7 and Windows Server 2008 R2 support a new feature called Core Parking. When many cores aren’t in use and this feature is active, these operating systems put the remaining cores to sleep. When these cores are necessary, the operating systems wake the sleeping cores.
Modern microprocessors work with dynamic frequencies for each of their cores. Because the cores don’t work with a fixed frequency, it is difficult to predict the performance for a sequence of instructions. For example, Intel Turbo Boost Technology increases the frequency of the active cores. The process of increasing the frequency for a core is also known as overclocking.
If a single core is under a heavy workload, this technology will allow it to run at higher frequencies when the other cores are idle. If many cores are under heavy workloads, they will run at higher frequencies but not as high as the one achieved by the single core. The microprocessor cannot keep all the cores overclocked for a long time, because it consumes more power and its temperature increases faster. The average clock frequency for all the cores under heavy workloads is going to be lower than the one achieved for the single core. Therefore, under certain situations, some code can run at higher frequencies than other code, which can make measuring real performance gains a challenge. 
Differences Between Shared-Memory Multicore and Distributed-Memory Systems
Distributed-memory computer systems are composed of many microprocessors with their own private memory, as illustrated in Figure 1-2. Each microprocessor can be in a different computer, with different types of communication channels between them. Examples of communication channels are wired and wireless networks. If a job running in one of the microprocessors requires remote data, it has to communicate with the corresponding remote microprocessor through the communication channel. One of the most popular communications protocols used to program parallel applications to run on distributed-memory computer systems is Message Passing Interface (MPI). It is possible to use MPI to take advantage of shared-memory multicore with C# and .NET Framework. However, MPI’s main focus is to help develop applications that run on clusters. Thus, it adds a big overhead that isn’t necessary in shared-memory multicore, where all the cores can access the memory without the need to send messages.
Figure 1-3 shows a distributed-memory computer system with three machines. Each machine has a quad-core microprocessor, and a shared-memory architecture for these cores. This way, the private memory for each microprocessor acts as a shared memory for its four cores.
A distributed-memory system forces you to think about the distribution of the data, because each message to retrieve remote data can introduce an important latency. Because you can add new machines (nodes) to increase the number of microprocessors for the system, distributed-memory systems can offer great scalability.
  • 38. FIGURE 1-2: Three microprocessors (#0, #1, and #2), each with its own private memory, connected through a communication channel.
Parallel Programming and Multicore Programming
Traditional sequential code, where instructions run one after the other, doesn’t take advantage of multiple cores because the serial instructions run on only one of the available cores. Sequential code written with Visual C# 2010 won’t take advantage of multiple cores if it doesn’t use the new features offered by .NET Framework 4 to split the work across many cores. There isn’t an automatic parallelization of existing sequential code.
Parallel programming is a form of programming in which the code takes advantage of the parallel execution possibilities offered by the underlying hardware. Parallel programming runs many instructions at the same time. As previously explained, there are many different kinds of parallel architectures, and their detailed analysis would require a complete book dedicated to the topic.
Multicore programming is a form of programming in which the code takes advantage of the multiple execution cores to run many instructions in parallel. Multicore and multiprocessor computers offer more than one processing core in a single machine. Hence, the goal is to do more in less time by distributing the work to be done among the available cores.
Modern microprocessors can execute the same instruction on multiple data, something classified by Michael J. Flynn in his proposed Flynn’s taxonomy in 1966 as Single Instruction, Multiple Data (SIMD). This way, you can take advantage of these vector processors to reduce the time needed to execute certain algorithms.
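The difference between sequential code and code that explicitly splits its work across cores can be sketched as follows. This is only an illustrative sketch, not an example taken from the book: the class name, the method names, and the `Work` function are hypothetical, and the sketch assumes that each iteration is independent of the others. `Parallel.For` belongs to the Task Parallel Library in the `System.Threading.Tasks` namespace.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class SequentialVersusParallel
{
    // Hypothetical CPU-bound work for one item; it touches no shared state.
    public static double Work(int i)
    {
        double result = 0;
        for (int k = 1; k <= 1000; k++)
        {
            result += Math.Sqrt(i + k);
        }
        return result;
    }

    // Sequential version: iterations run one after the other on a single core.
    public static double[] RunSequential(int count)
    {
        var results = new double[count];
        for (int i = 0; i < count; i++)
        {
            results[i] = Work(i);
        }
        return results;
    }

    // Parallel version: Parallel.For lets the runtime distribute the iterations
    // among the available cores. Each iteration writes only to its own array
    // slot, so this sketch needs no locking.
    public static double[] RunParallel(int count)
    {
        var results = new double[count];
        Parallel.For(0, count, i =>
        {
            results[i] = Work(i);
        });
        return results;
    }

    public static void Main()
    {
        double[] sequential = RunSequential(10000);
        double[] parallel = RunParallel(10000);
        Console.WriteLine("Same results: {0}", sequential.SequenceEqual(parallel));
    }
}
```

Both versions compute the same values; only the second one gives the runtime the chance to use more than one core. Whether it actually runs faster depends on the hardware and on the cost of each iteration.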
  • 39. FIGURE 1-3: Three microprocessors (#0, #1, and #2) connected through a communication channel; each has a private memory that is shared by its own cores.
This book covers two areas of parallel programming in great detail: shared-memory multicore programming and the usage of vector-processing capabilities. The overall goal is to reduce the execution time of the algorithms. The additional processing power enables you to add new features to existing software, as well.
UNDERSTANDING HARDWARE THREADS AND SOFTWARE THREADS
A multicore microprocessor has more than one physical core: real independent processing units that make it possible to run instructions at the same time, in parallel. In order to take advantage of multiple physical cores, it is necessary to run many processes or to run more than one thread in a single process, creating multithreaded code.
However, each physical core can offer more than one hardware thread, also known as a logical core or logical processor. Microprocessors with Intel Hyper-Threading Technology (HT or HTT) offer many architectural states per physical core. For example, many microprocessors with four physical cores with HT duplicate the architectural states per physical core and offer eight hardware threads. This technique is known as simultaneous multithreading (SMT) and it uses the additional architectural states to optimize and increase the parallel execution at the
  • 40. microprocessor’s instruction level. SMT isn’t restricted to just two hardware threads per physical core; for example, you could have four hardware threads per core. This doesn’t mean that each hardware thread represents a physical core. SMT can offer performance improvements for multithreaded code under certain scenarios. Subsequent chapters provide several examples of these performance improvements.
Each running program in Windows is a process. Each process creates and runs one or more threads, known as software threads to differentiate them from the previously explained hardware threads. A process has at least one thread, the main thread. An operating system scheduler shares out the available processing resources fairly between all the processes and threads it has to run. The Windows scheduler assigns processing time to each software thread. When the Windows scheduler runs on a multicore microprocessor, it has to assign time from a hardware thread, supported by a physical core, to each software thread that needs to run instructions. As an analogy, you can think of each hardware thread as a swim lane and a software thread as a swimmer.
Each software thread shares the private unique memory space with its parent process. However, it has its own stack, registers, and a private local storage.
Windows recognizes each hardware thread as a schedulable logical processor. Each logical processor can run code for a software thread. A process that runs code in multiple software threads can take advantage of hardware threads and physical cores to run instructions in parallel. Figure 1-4 shows software threads running on hardware threads and on physical cores. The Windows scheduler can decide to reassign one software thread to another hardware thread to load-balance the work done by each hardware thread. 
Because there are usually many other software threads waiting for processing time, load balancing will make it possible for these other threads to run their instructions by organizing the available resources. Figure 1-5 shows Windows Task Manager displaying eight hardware threads (logical cores and their workloads).
Load balancing refers to the practice of distributing work from software threads among hardware threads so that the workload is fairly shared across all the hardware threads. However, achieving perfect load balance depends on the parallelism within the application, the workload, the number of software threads, the available hardware threads, and the load-balancing policy.
Windows Task Manager and Windows Resource Monitor show the CPU usage history graphs for hardware threads. For example, if you have a microprocessor with four physical cores and eight hardware threads, these tools will display eight independent graphs.
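The counts that these tools display can also be read from code. The following is a minimal sketch, not an example from the book; the class and method names are hypothetical. It assumes that `Environment.ProcessorCount` reports logical processors (hardware threads) rather than physical cores, which is how the property is documented, and it uses `Process.GetCurrentProcess().Threads.Count` for the software threads of the current process. On newer runtimes or on restricted systems, the reported processor count may be limited by processor affinity or similar settings.

```csharp
using System;
using System.Diagnostics;

public static class HardwareAndSoftwareThreads
{
    // Number of logical processors (hardware threads) reported to this process.
    // On a microprocessor with four physical cores and two hardware threads per
    // core, this is normally eight, not four.
    public static int LogicalProcessorCount()
    {
        return Environment.ProcessorCount;
    }

    // Number of software threads that currently belong to this process.
    public static int SoftwareThreadCount()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return current.Threads.Count;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Hardware threads (logical processors): {0}", LogicalProcessorCount());
        Console.WriteLine("Software threads in this process: {0}", SoftwareThreadCount());
    }
}
```

Even a simple console application usually reports more than one software thread, because the runtime creates some threads of its own in addition to the main thread.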
  • 41. FIGURE 1-4: A process with six software threads, including the main software thread (software thread #0), scheduled on eight hardware threads (#0 through #7) provided by four physical cores (#0 through #3) that share the main memory.
  • 42. FIGURE 1-5
Windows runs hundreds of software threads by assigning chunks of processing time to each available hardware thread. You can use Windows Resource Monitor to view the number of software threads for a specific process in the Overview tab. The CPU panel displays the image name for each process and the number of associated software threads in the Threads column, as shown in Figure 1-6 where the vlc.exe process has 32 software threads.
FIGURE 1-6
Core Parking is a Windows kernel power manager and kernel scheduler technology designed to improve the energy efficiency of multicore systems. It constantly tracks the relative workloads of every hardware thread relative to all the others and can decide to put some of them into sleep mode. Core Parking dynamically scales the number of hardware threads that are in use based on workload. When the workload for one of the hardware threads is lower than a certain threshold value, the Core Parking algorithm will try to reduce the number of hardware threads that are in use by parking some of the hardware threads in the system. In order to make this algorithm efficient, the kernel scheduler gives preference to unparked hardware threads when it schedules software threads. The kernel scheduler will try to let the parked hardware threads become idle, and this will allow them to transition into a lower-power idle state.
  • 43. Core Parking tries to intelligently schedule work between threads that are running on multiple hardware threads in the same physical core on systems with microprocessors that include HT. This scheduling decision decreases power consumption.
Windows Server 2008 R2 supports the complete Core Parking technology. However, Windows 7 also uses the Core Parking algorithm and infrastructure to balance processor performance between hardware threads with microprocessors that include HT. Figure 1-7 shows Windows Resource Monitor displaying the activity of eight hardware threads, with four of them parked.
FIGURE 1-7
Regardless of the number of parked hardware threads, the number of hardware threads returned by .NET Framework 4 functions will be the total number, not just the unparked ones. Core Parking technology doesn’t limit the number of hardware threads available to run software threads in a process. Under certain workloads, a system with eight hardware threads can turn itself into a system with two hardware threads when it is under a light workload, and then increase and spin up reserve hardware threads as needed. In some cases, Core Parking can introduce an additional latency to schedule
  • 44. many software threads that try to run code in parallel. Therefore, it is very important to consider the resultant latency when measuring the parallel performance.
UNDERSTANDING AMDAHL’S LAW
If you want to take advantage of multiple cores to run more instructions in less time, it is necessary to split the code into parallel sequences. However, most algorithms need to run some sequential code to coordinate the parallel execution. For example, it is necessary to start many pieces in parallel and then collect their results. The code that splits the work in parallel and collects the results could be sequential code that doesn’t take advantage of parallelism. If you concatenate many algorithms like this, the overall percentage of sequential code could increase and the performance benefits achieved may decrease.
Gene Amdahl, a renowned computer architect, made observations regarding the maximum performance improvement that can be expected from a computer system when only a fraction of the system is improved. He used these observations to define Amdahl’s Law, which consists of the following formula that tries to predict the theoretical maximum performance improvement (known as speedup) using multiple processors. It can also be applied to parallelized algorithms that are going to run on multicore microprocessors.
Maximum speedup (in times) = 1 / ((1 - P) + (P / N))
where:
- P is the portion of the code that runs completely in parallel.
- N is the number of available execution units (processors or physical cores).
According to this formula, if you have an algorithm in which only 50 percent (P = 0.50) of its total work is executed in parallel, the maximum speedup will be 1.33x on a microprocessor with two physical cores. Figure 1-8 illustrates an algorithm with 1,000 units of work split into 500 units of sequential work and 500 units of parallelized work.
If the sequential version takes 1,000 seconds to complete, the new version with some parallelized code will take no less than 750 seconds.
Maximum speedup (in times) = 1 / ((1 - 0.50) + (0.50 / 2)) = 1.33x
The maximum speedup for the same algorithm on a microprocessor with eight physical cores will be a really modest 1.78x. Therefore, the additional physical cores will make the code take no less than 562.5 seconds.
Maximum speedup (in times) = 1 / ((1 - 0.50) + (0.50 / 8)) = 1.78x
Figure 1-9 shows the maximum speedup for the algorithm according to the number of physical cores, from 1 to 16. As you can see, the speedup isn’t linear, and it wastes processing power as the number of cores increases. Figure 1-10 shows the same information using a new version of the algorithm in which 90 percent (P = 0.90) of its total work is executed in parallel. In fact, 90 percent of
  • 45. parallelism is a great achievement, but it results in a 6.40x speedup on a microprocessor with 16 physical cores.
Maximum speedup (in times) = 1 / ((1 - 0.90) + (0.90 / 16)) = 6.40x
FIGURE 1-8 (the original sequential version has 1,000 units of total work, all of it sequential; the optimized version keeps the same 1,000 units of total work as two blocks of 250 units of sequential work plus 500 units of completely parallelized work, which is 250 units on each physical core with 2 physical cores and 62 or 63 units on each physical core with 8 physical cores)
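The Amdahl’s Law figures quoted above can be reproduced with a few lines of C#. The following is only a minimal sketch: the class and method names (AmdahlSketch, MaxSpeedup, MinSeconds) are hypothetical and chosen for illustration, and the code simply evaluates the formula for the P and N values used in the text.

```csharp
using System;

static class AmdahlSketch
{
    // Maximum speedup = 1 / ((1 - P) + (P / N))
    // p: portion of the work that runs completely in parallel (0.0 to 1.0)
    // n: number of available execution units (processors or physical cores)
    public static double MaxSpeedup(double p, int n)
    {
        if (p < 0.0 || p > 1.0) throw new ArgumentOutOfRangeException("p");
        if (n < 1) throw new ArgumentOutOfRangeException("n");
        return 1.0 / ((1.0 - p) + (p / n));
    }

    // Lower bound for the run time of the parallelized version,
    // given the run time of the original sequential version.
    public static double MinSeconds(double sequentialSeconds, double p, int n)
    {
        return sequentialSeconds / MaxSpeedup(p, n);
    }

    static void Main()
    {
        // The 50 percent parallel algorithm on 2, 8, and 16 physical cores.
        foreach (int cores in new[] { 2, 8, 16 })
        {
            Console.WriteLine("P = 0.50, N = {0,2}: speedup {1:F2}x, at least {2:F1} s",
                cores, MaxSpeedup(0.50, cores), MinSeconds(1000.0, 0.50, cores));
        }

        // The 90 percent parallel algorithm on 16 physical cores.
        Console.WriteLine("P = 0.90, N = 16: speedup {0:F2}x", MaxSpeedup(0.90, 16));
    }
}
```

Evaluating the formula in a loop like this makes it easy to see how quickly the curve flattens when the sequential portion stays fixed.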
  • 46. FIGURE 1-9 (plot of maximum speedup, 0 to 16, against the number of physical cores, 1 to 16)
FIGURE 1-10 (plot of maximum speedup, 0 to 16, against the number of physical cores, 1 to 16)
  • 47. Amdahl’s Law takes into account changes in the number of physical cores, but it doesn’t consider potential new features that you could add to existing applications to take advantage of the additional parallel processing power. For example, you can create new algorithms that take advantage of the additional cores while you run other algorithms in parallel that don’t achieve great performance improvements when they run with more than three cores. You can create designs that consider different parallelism scenarios to reduce the impact of Amdahl’s Law. The applications have to evolve as the hardware offers new capabilities.
CONSIDERING GUSTAFSON’S LAW
John Gustafson noticed that Amdahl’s Law viewed the algorithms as fixed, while considering the changes in the hardware that runs them. Thus, he suggested a reevaluation of this law in 1988. He argued that speedup should be measured by scaling the problem to the number of processors and not by fixing the problem size. When the parallel-processing possibilities offered by the hardware increase, the problem workload scales. Gustafson’s Law provides the following formula with the focus on the problem size to measure the amount of work that can be performed in a fixed time:
Total work (in units) = S + (N × P)
where:
- S represents the units of work that run with a sequential execution.
- P is the size of each unit of work that runs completely in parallel.
- N is the number of available execution units (processors or physical cores).
You can consider a problem composed of 50 units of work with a sequential execution. The problem can also schedule parallel work in 50 units of work for each available core. If you have a microprocessor with two physical cores, the maximum amount of work is going to be 150 units.
Total work (in units) = 50 + (2 × 50) = 150 units of work
Figure 1-11 illustrates an algorithm with 50 units of work with a sequential execution and a parallelized section. The latter scales according to the number of physical cores: each core processes 50 units of parallelizable work. The workload in the parallelized section increases when more cores are available. The algorithm can process more data in less time if there are enough additional units of work to process in the parallelized section. The same algorithm can run on a microprocessor with eight physical cores. In this case, it will be capable of processing 450 units of work in the same amount of time required for the previous case:
Total work (in units) = 50 + (8 × 50) = 450 units of work
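The two Gustafson’s Law calculations above can be checked with a small helper. This is a minimal sketch; the names GustafsonSketch and TotalWork are hypothetical, and the arguments map directly to S, P, and N in the formula.

```csharp
using System;

static class GustafsonSketch
{
    // Total work = S + (N * P)
    // sequentialUnits: S, units of work that run with a sequential execution
    // parallelUnitsPerCore: P, size of each unit of work that runs completely in parallel
    // cores: N, number of available execution units (processors or physical cores)
    public static int TotalWork(int sequentialUnits, int parallelUnitsPerCore, int cores)
    {
        if (cores < 1) throw new ArgumentOutOfRangeException("cores");
        return sequentialUnits + (cores * parallelUnitsPerCore);
    }

    static void Main()
    {
        // 50 sequential units plus 50 parallel units for each core.
        Console.WriteLine("2 cores: {0} units", TotalWork(50, 50, 2)); // 150
        Console.WriteLine("8 cores: {0} units", TotalWork(50, 50, 8)); // 450
    }
}
```

Unlike the Amdahl calculation, the result here grows linearly with the number of cores, because the problem size is assumed to scale with the hardware.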
  • 48. FIGURE 1-11 (with 2 physical cores: two blocks of 25 units of sequential work plus 100 units of completely parallelized work, at 50 units on each physical core, for a total of 150 units; with 8 physical cores: two blocks of 25 units of sequential work plus 400 units of completely parallelized work, at 50 units on each physical core, for a total of 450 units)
Figure 1-12 shows the speedup for the algorithm according to the number of physical cores, from 1 to 16. This speedup is possible provided there are enough units of work to process in parallel when the number of cores increases. As you can see, the speedup is better than the results offered by applying Amdahl’s Law. Figure 1-13 shows the total amount of work according to the number of available physical cores, from 1 to 32.
Sometimes, the amount of time spent in sequential sections of the program depends on the problem size. In these cases, you can scale the problem size in order to improve the chances of achieving better speedups than the ones calculated by Amdahl’s Law. However, some problems limit how far the volume of data to be processed in parallel can scale. When this happens, you can add new features to take advantage of the parallel processing power available in modern hardware, or you can work with different designs. Subsequent chapters teach many techniques to prepare algorithms to improve the total work calculated by Gustafson’s Law.
  • 49. FIGURE 1-12 (plot of speedup, 0 to 16, against the number of physical cores, 1 to 16)
FIGURE 1-13 (plot of total work in units, 0 to 1,800, against the number of physical cores, 1 to 32)
  • 50. Figure 1-14 illustrates many algorithms composed of several units of work with a sequential execution and parallelized sections. The parallelized sections scale as the number of available cores increases. The impact of the sequential sections decreases as more scalable parallelized sections run units of work. In this case, it is necessary to calculate the total units of work for both the sequential and parallelized sections and then apply them to the formula to find out the total work with eight physical cores:
Total sequential work (in units) = 25 + 150 + 100 + 150 = 425 units of work
Total parallel unit of work (in units) = 50 + 200 + 300 = 550 units of work
Total work (in units) = 425 + (8 × 550) = 4,825 units of work
FIGURE 1-14 (sequential sections of 25, 150, 100, and 150 units alternate with completely parallelized sections of 50, 200, and 300 units on each physical core)
A sequential execution would be capable of executing only 975 units of work in the same amount of time:
Total work with a sequential execution (in units) = 25 + 50 + 150 + 200 + 100 + 300 + 150 = 975 units of work
WORKING WITH LIGHTWEIGHT CONCURRENCY
Neither Amdahl’s Law nor Gustafson’s Law takes into account the overhead introduced by parallelism. Nor do they consider the existence of patterns that allow the transformation of sequential parts into new algorithms that can take advantage of parallelism. It is very important to reduce the sequential code that has to run in applications to improve the usage of the parallel execution units.
  • 51. In previous .NET Framework versions, if you wanted to run code in parallel in a C# application (a process), you had to create and manage multiple threads (software threads). Therefore, you had to write complex multithreaded code. Splitting algorithms into multiple threads, coordinating the different units of code, sharing information between them, and collecting the results are indeed complex programming jobs. As the number of logical cores increases, it becomes even more complex, because you need more threads to achieve better scalability.
The multithreading model wasn’t designed to help developers tackle the multicore revolution. In fact, creating a new thread requires a lot of processor instructions and can introduce a lot of overhead for each algorithm that has to be split into parallelized threads. Many of the most useful structures and classes were not designed to be accessed by different threads, and, therefore, a lot of code had to be added to make this possible. This additional code distracts the developer from the main goal: achieving a performance improvement through parallel execution.
Because this multithreading model is too complex to handle the multicore revolution, it is known as heavyweight concurrency. It adds significant overhead. It requires adding too many lines of code to handle potential problems because of its lack of support for multithreaded access at the framework level, and it makes the code complex to understand.
The aforementioned problems associated with the multithreading model offered by previous .NET Framework versions and the increasing number of logical cores offered in modern microprocessors motivated the creation of new models to allow creating parallelized sections of code. The new model is known as lightweight concurrency, because it reduces the overall overhead needed to create and execute code in different logical cores.
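The contrast between the two models can be illustrated with a small sketch. The code below is an assumption-laden illustration, not code from the book: ConcurrencyModelsSketch and its methods are hypothetical names, and SumOfSquares is an arbitrary CPU-bound unit of work. The first method creates, starts, and joins threads by hand; the second describes the same work as tasks with Task.Factory.StartNew and lets the .NET Framework 4 task scheduler decide which threads run them.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ConcurrencyModelsSketch
{
    // Hypothetical CPU-bound unit of work, used only for illustration.
    public static long SumOfSquares(int fromInclusive, int toExclusive)
    {
        long sum = 0;
        for (int i = fromInclusive; i < toExclusive; i++)
        {
            sum += (long)i * i;
        }
        return sum;
    }

    // Heavyweight style: the code creates, starts, and joins each thread itself
    // and passes the results back through captured variables.
    public static long WithThreads()
    {
        long first = 0;
        long second = 0;
        var t1 = new Thread(() => { first = SumOfSquares(0, 50000); });
        var t2 = new Thread(() => { second = SumOfSquares(50000, 100000); });
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return first + second;
    }

    // Lightweight style: the code describes tasks that return values;
    // the task scheduler maps them to worker threads.
    public static long WithTasks()
    {
        Task<long> first = Task.Factory.StartNew(() => SumOfSquares(0, 50000));
        Task<long> second = Task.Factory.StartNew(() => SumOfSquares(50000, 100000));
        return first.Result + second.Result; // Result waits for each task to finish
    }

    static void Main()
    {
        Console.WriteLine("Threads: {0}", WithThreads());
        Console.WriteLine("Tasks:   {0}", WithTasks());
    }
}
```

Both methods compute the same value; the task-based version has less plumbing because each task carries its own result.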
This doesn’t mean that the lightweight concurrency model eliminates the overhead introduced by parallelism, but the model is prepared to work with modern multicore microprocessors. The heavyweight concurrency model was born in the multiprocessor era, when a computer could have many physical microprocessors with one physical core in each. The lightweight concurrency model takes into account the new microarchitectures in which many logical cores are supported by some physical cores.
The lightweight concurrency model is not just about scheduling work in different logical cores. It also adds support for multithreaded access at the framework level, and it makes the code much simpler to understand.
Most modern programming languages are moving to the lightweight concurrency model. Luckily, .NET Framework 4 is part of this transition. Thus, all the managed languages that can generate .NET applications can take advantage of the new model.
CREATING SUCCESSFUL TASK-BASED DESIGNS
Sometimes, you have to optimize an existing solution to take advantage of parallelism. In these cases, you have to understand an existing sequential design or a parallelized algorithm that offers a reduced scalability, and then you have to refactor it to achieve a performance improvement without introducing problems or generating different results. You can take a small part or the whole problem and create a task-based design, and then you can introduce parallelism. The same technique can be applied when you have to design a new solution.
You can create successful task-based designs by following these steps:
1. Split each problem into many subproblems and forget about sequential execution.
  • 52.
2. Think about each subproblem as any of the following:
   - Data that can be processed in parallel: decompose data to achieve parallelism.
   - Data flows that require many tasks and that could be processed with some kind of complex parallelism: decompose data and tasks to achieve parallelism.
   - Tasks that can run in parallel: decompose tasks to achieve parallelism.
3. Organize your design to express parallelism.
4. Determine the need for tasks to chain the different subproblems. Try to avoid dependencies as much as possible.
5. Design with concurrency and potential parallelism in mind.
6. Analyze the execution plan for the parallelized problem considering current multicore microprocessors and future architectures. Prepare your design for higher scalability.
7. Minimize critical sections as much as possible.
8. Implement parallelism using task-based programming whenever possible.
9. Tune and iterate.
The aforementioned steps don’t mean that all the subproblems are going to be parallelized tasks running in different threads. The design has to consider the possibility of parallelism and then, when it is time to code, you can decide the best option according to the performance and scalability goals. It is very important to think in parallel and split the work to be done into tasks. This way, you will be able to parallelize your code as needed. If you have a design prepared for a classic sequential execution, it is going to take a great effort to parallelize it by using task-based programming techniques.
You can combine task-based designs with object-oriented designs. In fact, you can use object-oriented code to encapsulate parallelism and create parallelized objects and components.
Designing With Concurrency in Mind
When you design code to take advantage of multiple cores, it is very important to stop thinking that the code inside a C# application is running alone.
C# is prepared for concurrent code, meaning that many pieces of code can run inside the same process simultaneously or with an interleaved execution. The same class method can be executed in concurrent code. If this method saves a state in a static variable and then uses this saved state later, many concurrent executions could yield unexpected and unpredictable results.
As previously explained, parallel programming for multicore microprocessors works with the shared-memory model. The data resides in the same shared memory, which could lead to unexpected results if the design doesn’t consider concurrency.
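The risk of shared static state can be shown with a deliberately simplified sketch. The names below (SharedStateSketch, Run, and the two counters) are hypothetical. One static counter is updated with a plain increment, which is a read-modify-write sequence that concurrent executions can interleave; the other is updated with Interlocked.Increment, which performs the update atomically. Whether the unprotected counter actually loses updates on a given run depends on timing and hardware, so the sketch only guarantees the final value of the protected one.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class SharedStateSketch
{
    static int unsafeCounter; // shared static state without any protection
    static int safeCounter;   // shared static state updated atomically

    public static int UnsafeCounter { get { return unsafeCounter; } }
    public static int SafeCounter { get { return safeCounter; } }

    public static void Run(int iterations)
    {
        unsafeCounter = 0;
        safeCounter = 0;

        // Parallel.For may run the body on several threads at the same time.
        Parallel.For(0, iterations, i =>
        {
            // Read, add one, write back: two threads can read the same old
            // value, so some increments may be lost.
            unsafeCounter++;

            // Atomic increment: no update is lost.
            Interlocked.Increment(ref safeCounter);
        });
    }

    static void Main()
    {
        Run(1000000);
        Console.WriteLine("Unprotected counter: {0}", UnsafeCounter);
        Console.WriteLine("Interlocked counter: {0}", SafeCounter);
    }
}
```

On a multicore machine the unprotected counter often ends up below the iteration count, but a run that happens to produce the right value proves nothing about the design.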
  • 53. It is a good practice to prepare each class and method to be able to run concurrently, without side effects. If you have classes, methods, or components that weren’t designed with concurrency in mind, you would have to test their designs before using them in parallelized code.
Each subproblem detected in the design process should be capable of running while the other subproblems are being executed concurrently. If you think that it is necessary to restrict concurrent code when a certain subproblem runs because it uses legacy classes, methods, or components, it should be made clear in the design documents. Once you begin working with parallelized code, it is very easy to incorporate other existing classes, methods, and components that create undesired side effects because they weren’t designed for concurrent execution.
Understanding the Differences between Interleaved Concurrency, Concurrency, and Parallelism
Figure 1-15 illustrates the differences between interleaved concurrency and concurrency when there are two software threads and each one executes four instructions. The interleaved concurrency scenario executes one instruction for each thread, interleaving them, but the concurrency scenario runs two instructions in parallel, at the same time. The design has to be prepared for both scenarios.
Concurrency requires physically simultaneous processing to happen.
Parallelism entails partitioning work to be done, running processing on those pieces concurrently, and joining the results. Parallelizing a problem generates concurrency.
Parallelized code can run in many different concurrency and interleaved concurrency scenarios, even when it is executed in the same hardware configuration. Thus, one of the great challenges of a parallel design is to make sure that its execution with different possible valid orders and interleaves will lead to the correct result, otherwise known as correctness.
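The partition, process, and join sequence described above can be sketched in a few lines. This is an illustrative assumption rather than the book’s code: PartitionJoinSketch and ParallelSum are hypothetical names. The array is split into contiguous chunks, each chunk is summed by its own task, and the partial sums are joined at the end. Because each task writes only to its own local variable, the result is the same for every valid execution order.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class PartitionJoinSketch
{
    public static long ParallelSum(int[] data, int partitions)
    {
        if (data == null) throw new ArgumentNullException("data");
        if (partitions < 1) throw new ArgumentOutOfRangeException("partitions");

        // Partition: compute the chunk size, rounding up.
        int chunk = (data.Length + partitions - 1) / partitions;
        var tasks = new Task<long>[partitions];

        for (int p = 0; p < partitions; p++)
        {
            // Locals declared inside the loop, so each task captures its own range.
            int start = Math.Min(p * chunk, data.Length);
            int end = Math.Min(start + chunk, data.Length);

            // Process: each task sums its own chunk into a private variable.
            tasks[p] = Task.Factory.StartNew(() =>
            {
                long partial = 0;
                for (int i = start; i < end; i++)
                {
                    partial += data[i];
                }
                return partial;
            });
        }

        // Join: wait for every task and combine the partial results.
        Task.WaitAll(tasks);
        long total = 0;
        foreach (Task<long> task in tasks)
        {
            total += task.Result;
        }
        return total;
    }

    static void Main()
    {
        int[] data = Enumerable.Range(1, 1000).ToArray();
        Console.WriteLine(ParallelSum(data, 4)); // 500500
    }
}
```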
If you need a specific order or certain parts of the code must not run together, it is necessary to make sure that these parts don’t run concurrently. You cannot assume that they don’t run concurrently just because you ran the code many times and it produced the expected results. When you design for concurrency and parallelism, you have to make sure that you consider correctness.
In the next chapter, you will learn more about the differences between concurrency and parallelism by looking at various code samples.
Parallelizing Tasks
Visual C# 2010 and .NET Framework 4 make it easy to transform task-based designs into parallelized code. However, it is very important to understand that the parallelized code requires specific testing and tuning procedures in order to achieve the expected goals. You will learn about them through the rest of the book.
When you parallelize the tasks, the overhead introduced by parallelism can have an important impact and may require testing different alternatives. As previously explained, modern multicore microprocessors are extremely complex, and it is necessary to test the results offered by different parallelizing techniques until you can make your choice. In fact, the same happens with sequential
  • 54. code, but the difference is that you already know, for example, that a foreach loop can be slower than a for loop. While parallelizing tasks, a parallelized version of a for loop can offer many different performance results according to certain parameters that determine the way the parallelized tasks are executed. Once you experience these scenarios, you will be able to consider them when you have to write the code for similar problems and analogous task-based designs.
FIGURE 1-15 (interleaved concurrency: at times t0 to t7 only one instruction runs at a time, alternating between thread 0 and thread 1, so the order is instruction 0 of thread 0, instruction 0 of thread 1, instruction 1 of thread 0, instruction 1 of thread 1, and so on up to instruction 3; concurrency with physically simultaneous processing: at times t0 to t3, thread 0 and thread 1 each execute instructions 0 to 3 at the same time)
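One parameter that changes how a parallelized for loop executes is the maximum degree of parallelism. The following sketch, built on assumptions, shows one way to experiment with it: ParallelForTuningSketch and SumOfSquares are hypothetical names, the workload is arbitrary, and the timings it prints will differ from machine to machine. It uses the Parallel.For overload with thread-local state, so each worker accumulates a private subtotal that is merged atomically at the end.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

static class ParallelForTuningSketch
{
    // Sums i * i for i in [0, n) with a parallelized for loop.
    public static long SumOfSquares(int n, int maxDegreeOfParallelism)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };
        long total = 0;

        Parallel.For<long>(0, n, options,
            () => 0L,                                      // initial subtotal for each worker
            (i, loopState, local) => local + (long)i * i,  // accumulate without shared state
            local => Interlocked.Add(ref total, local));   // merge each subtotal atomically

        return total;
    }

    static void Main()
    {
        const int n = 1000000;

        // Environment.ProcessorCount reports the number of logical cores.
        foreach (int degree in new[] { 1, 2, Environment.ProcessorCount })
        {
            Stopwatch watch = Stopwatch.StartNew();
            long result = SumOfSquares(n, degree);
            watch.Stop();
            Console.WriteLine("Degree {0}: result {1}, {2} ms",
                degree, result, watch.ElapsedMilliseconds);
        }
    }
}
```

The result must be identical for every degree of parallelism; only the elapsed time should change, and for a loop body this small the parallel versions may not be faster at all.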
  • 55. Exploring the Variety of Random Documents with Different Content
  • 56. Priessnitz says it is difficult to prescribe for these complaints at a distance; and that except in young people, or where the disease is in its infancy, a cure is seldom effected. It is however always safe to adopt the following treatment, which will refresh and strengthen the patient. Three rubbing-sheets, at intervals during the day. One or two foot-baths, but NO sitz-baths without advice. If the feet swell, continue the treatment, all the same, rub with wet hands, and bandage the legs, from the ankle to the knee, this will reduce the swelling. Spine complaint and general debility.—A lady. Morning, packing-sheet until warm, followed by plunge-bath one minute; noon, douche three minutes, return home and then take a rubbing-sheet and sitz-bath, twenty minutes; afternoon, as in the morning. Rubbed the back and nape of the neck with wet hands, twice a day. Patient staid all the winter; during which time symptoms were combated as they arose, she gained strength and flesh. Spinal affection.—A young lady, after submitting to all sorts of medical treatment for three or four years, came to Gräfenberg. She was clothed in flannel, suffered greatly from indigestion, constipation, and languid circulation, feet always cold, walking a short distance brought on pain in the back. Second day after her arrival, Priessnitz ordered,— “Put aside all flannel, go as lightly clad as possible, keep bed-room window open day and night, and sleep with only a single sheet as a covering, leave off stockings and run bare-footed on the wet grass near the house, or on the cold stones of the passage for half an hour before breakfast in the morning.
  • 57. “Eat black bread and drink sour milk, lie on the stomach and have the spine rubbed several times a day with wet hands.” First four days, patient had cold feet in and after the packing-sheet, this was then followed by tepid, then cold, and back to tepid-bath, feet well rubbed, previous to going into packing-sheet, and last thing at night; by this treatment head-ache was relieved and the feet became warmer. In ten days began the douche for one minute; digestion improved; no longer constipated. Bandages always round the body, and to feet and legs at night. Patient was at Gräfenberg nine months, during which time the treatment was often changed to meet circumstances. One time, suppressed catamenia was relieved in two days by sixteen rubbing- sheets a day. At another, patient met with an accident in the leg; Priessnitz to keep this to the surface, ordered more water to be drunk. This patient left Gräfenberg in excellent health, though not entirely cured of the affection of the spine, that being out of its perpendicular position. Pain in the Shoulder and Chest.—A lady in the treatment complained of pain in the shoulder and left breast, and down the side. Ordered, when in sitz-bath the upper part of the body to be well rubbed. Body bandage to be more wrung out than usual, and extra covering over it. Pain in the side, Chronic cold in the head.—A German officer aged 50, afflicted as above, and with continued stoppage in the nose, and frequent head-aches, was told by his medical man that he had no chance of being cured, was completely relieved at Gräfenberg, in three or four months.
  • 58. Packing-sheets and tepid baths twice a day. Rubbing-sheet and sitz- baths were resorted to for a short time, the cold bath substituted for the tepid bath, and to this treatment the douche was added. Weak Chest and Worms.—A child three years old. Wash with tepid water, 12° once, and after some time twice a day. Wear body bandage always, and drink water. Pain in the Chest.—A gentleman had pain in his chest, like the hurt from a blow, about the size of a crown-piece. Ordered sixteen rubbing-sheets a day, four at each time. LXXI.—Constipation. This complaint is always relieved, and if sufficient time is devoted to the treatment, finally overcome by Hydropathy; space forbids my going into details, or numerous cases might be given in proof of this assertion. The reader’s attention may however be called to the letter addressed to a newspaper, and signed by upwards of one hundred patients, giving the case of the son of Prince Leichtenstein, who was cured in a few days of Constipation, which had endured twenty-eight days in defiance of all medical aid. To effect a permanent cure, the treatment must be persevered in for a long time, very often a twelvemonth. In a recent case. Rubbing-sheets until feverish heat ceases: sometimes four or three suffice; at others the number must be increased to sixteen or twenty, to be immediately followed by a clyster. Then take a walk, and on returning, a sitz-bath fifteen to twenty minutes, the abdomen to be well rubbed the whole time. Body bandage to be worn always and often changed. This treatment to be resorted to twice a day. Great exercise to be used, and cold light food to be partaken of.
  • 59. A delicate lady who had suffered from this complaint for upwards of twenty years, was relieved in a fortnight, and had no return of it during her stay at Gräfenberg. Her principal treatment was:— Packing-sheet and bath twice a day. Rubbing-sheet and sitz-bath at noon. A second case, which came under my observation, was that of a Russian, who for many years had only been relieved by medicine or enemas. He went to an establishment at Moscow for six months, where he derived great benefit, though he still used enemas. At Gräfenberg he abandoned the latter, his bowels were relaxed and have continued so ever since. LXXII.—Indigestion. Foul tongue and pain at the pit of the stomach; a lady having tried all other remedies, was ordered the following, which answered admirably. Three cold sitz-baths a day, for an hour each time, rubbing the abdomen the whole time, eat nothing but brown bread and drink sour milk during three days. Loss of Appetite, Foul Tongue, etc.—Patient had foul tongue, and loss of appetite. Morning.—Sweating and tepid bath, stomach to be well rubbed in the bath. Sitz-bath thirty minutes in the afternoon. It is very essential to drink abundantly of water, and take great exercise. A child five years old. Pale, foul tongue, loss of appetite, thirsty and awaking with screams. Ablution in the morning, and three tepid sitz- baths daily four minutes each; chest, back, and abdomen to [be]
  • 60. well rubbed all the time; waist bandage day night. Drink as much water as possible. Cured in three months. LXXIII.—Stomach Complaint. Patient’s stomach deranged, food used to return to his mouth: difficult of cure. His second visit to Gräfenberg, cured in nine months. Packing-sheets and rubbing-sheets. Noon, douche, rubbing- sheet and sitz-bath; afternoon, packing-sheet and bath. LXXIV.—Throwing Food off the Stomach. Morning, rubbing-sheet and sitz-bath fifteen minutes. Noon, the same repeated. Afternoon, sitz-bath. A gentleman of my acquaintance pursued three or four months’ treatment for this complaint, and left Gräfenberg without being cured. LXXV.—Heartburn. Drink large quantit[i]es of water fasting, rub the part with wet hands and wear a large bandage, changed often, round the waist. If this does not effect a cure, take a rubbing-sheet or two and a tepid sitz- bath twice a day. Nausea and sickness are to be treated in the same manner; if, however, the latter become chronic, then packing-sheets, tepid baths, and sitz-baths must be resorted to. The diet should be brown bread and milk only. The milk should be boiled, if it otherwise disagrees with the patient. LXXVI.—Sea Sickness. To avoid sea-sickness or relieve it. The traveller should lay on his back, and place a large wet towel on his abdomen, changing it when
  • 61. dry. After a sea voyage take a few rubbing-sheets and sitz-baths. Wear a waist bandage, and if constipated resort to cold water clysters. LXXVII.—Palpitation of the Heart. Many rubbing-sheets; rub the whole, side for a long time and often. Large bandage. Two sitz-baths a day, fifteen minutes each; rubbing the afflicted side the whole time. A lady afflicted as above was relieved in ten minutes by the rubbing-sheets, and dabbling her feet well in cold water. LXXVIII.—Want of Sleep. Before going to bed, take a shallow foot-bath (only to cover the soles of the foot) for seven to ten minutes, rubbing the feet to above the ankles all the time, then walk about the room bare-footed until the feet are quite warm. A lady, in the treatment, complained of want of sleep. Two packing-sheets in the afternoon, the first changed as soon as hot, followed by tepid bath. Two foot-baths for one hour each, the water only to cover the soles of the feet. Feet to be well rubbed the whole time. When the servant is tired of rubbing, patient should walk about the room with bare feet for a few minutes and then resume the foot-bath. LXXIX.—Languid Circulation. I attended many cases of this kind with Mr. Priessnitz, where the languid circulation arose from using the head more than the body. In a general way he began with rubbing-sheets in the morning and afternoon for a few days, and then in the morning packing-sheet until warm, and tepid bath, cold bath, and back to tepid bath. Noon,
  • 62. rubbing-sheet and tepid sitz-bath fifteen minutes; afternoon, packing-sheet and tepid baths as in the morning; or a rubbing-sheet. Bandaged always. LXXX.—Ring Worm. A boy aged seven years had ring worm over the eye and behind his knees. Cured in six weeks. Two packing-sheets and tepid baths daily. Bandage to the knees. Child could not endure the douche. LXXXI.—Hands Frost-bitten or Suffering from a Boil. Rub the hands well with tepid water, and particularly the wrist. Put the elbow into cold water for twenty minutes, three times a day. Bandage the whole arm from the arm-pits down to the wrist. LXXXII.—Weak Eyes and Eruption on the Head. A child two years old had weak eyes, from which there was a constant discharge and an eruption on the face and head; it was treated as follow:— Packing-sheet one hour and sometimes longer, followed by tepid bath. Large bandage from hips to arm-pits night and day. Dabbed the face often with cold water and bandaged the head at night. In three weeks eyes quite well and the eruption diminished. LXXXIII.—Weak Ankles. If an infant, ablution every morning and bandage the ankles night and day. If an older person, ablution and foot-baths twenty minutes. Morning and afternoon, bandage always.
  • 63. LXXXIV.—Treatment of Infants. Immediately after birth bathe the infant in warm water 82°, put a wet bandage on navel, bound on with a dry one, change it morning and evening only. Continue this until the navel is healed. The temperature of the bath to be reduced two degrees every fortnight, until 68°, which is to be used until child can run alone. It may be washed with cold water at three months of age. If an Infant is uneasy or restless and cries.—Put on a body bandage; if this is not sufficient, give it an extra tepid-bath. The child of an Hungarian commissioner was born weak and sickly, with great difficulty in breathing. The physicians treated the mother to improve the milk, when the child refused the breast. From three days old it was spoon-fed. On the fifth or sixth day, the father put the child into a packing-sheet until it was warm, when he changed it, and then applied the tepid-bath. After four day’s treatment a lump appeared on the chest, which increased until it became as large as a man’s fist. On the eighth day it broke, and half a tumbler of matter was discharged. From this moment the child gradually improved and is now the healthiest of his children. Child-teething, Pain in the Head, and Diarrhœa.—Tepid bath for about five minutes three times a day. Two head-baths from ten to fifteen minutes each, and one clyster. A body bandage, and change it often. LXXXV.—Epilepsy. This complaint in a general way is not to be cured by Hydropathy; but Priessnitz thinks persons subject to it should use cold baths, and
cold water as a beverage. I know a young man who was six months at Gräfenberg, it is now twelve-months since, and as he has not had an attack, he considers himself cured.

LXXXVI.—Hypochondria and Hysteria.

A disarrangement of the system, and inaction of the abdomen, cause much uneasiness and discontent. This disease being moral as well as physical, requires pure air, scenery, society, and a complete change in the manner of living. What is so calculated to combat this complaint as Hydropathy?

A patient became hypochondriac, in consequence of chronic derangement of bowels, struck with rush of blood to the head, face became crimson, lost speech and consciousness, had convulsions and spasmodic movement of the arms.

First operation was to put him into a cold bath, and use strong friction for an hour. He was put into a packing-sheet, in which he became delirious; he was then rubbed by four men in a tepid bath, 64°. He was still unconscious and yet winced on being pinched; water thrown on his head caused a slight cry; great heat on the head. On ceasing the cold affusion, pulse though oppressed began to be felt—eyes fixed—conjunctiva inflamed. Friction continued two hours, then ceased for one hour and a half, and begun again: in an hour spasms ceased, eyes began to move, without seeing. Patient apparently exhausted, pulse gained its power, though still often intermittent, upper part of the body hot, lower extremities could not be warmed all night, consciousness had not returned in the morning, pulse better, but sleep interrupted,—patient groaning. All night wet bandage applied to the head.

At 6 o’clock next morning, sweating process, perspiration preceded consciousness, up to which moment patient was insensible to all that had occurred. After half an hour’s sweating, he was well rubbed in tepid bath 66°, and put to bed, when he slept. On awaking he partook of bread and milk.
At 2 o’clock p.m., awoke covered with perspiration, and from that time until next morning, slept at intervals, pulse regular, talked calmly and rationally, bowels in a normal state.

In the morning, packing-sheet; and later, sweating process; both followed by tepid bath 64°—temperature of the body still high. After good night’s rest, appetite returned, and so much better as to renew the treatment to effect a cure of that which brought him to Gräfenberg.

LXXXVII.—Fœtid Perspiration of the Feet.

This is relieved by foot-baths, and wearing a bandage on the feet at night; but it cannot be cured without the sweating process.

LXXXVIII.—Stricture.

Sweating and tepid bath, and cold sitz-baths, are generally resorted to in this complaint. If cold water is found too severe, tepid is used for a time; a bandage is always applied to parts affected.

For stoppage of the water, three to six rubbing-sheets; if they fail, resort to sweating process until water comes, then a tepid bath, or rubbing-sheet. Medical men, to effect this object, put the patient first into a warm bath, and then bleed him until he faints: by these means, the prostate gland becomes relaxed, and water flows; or water is passed by the use of catheters, which at Gräfenberg are always dispensed with.

LXXXIX.—Inflammation of the Kidneys and Urethra.

The treatment must be regulated by circumstances: sometimes sweating, at other times the packing-sheet, tepid bath, and
bandage.

XC.—Hydrocephalus.

A child one year and a-half old had water on the brain, and a large protuberance in the middle of the forehead. Ordered, a tepid bath morning and evening; a rubbing-sheet after an hour’s sleep at noon, and repeated before going to bed at night. Drank water only at meals, and then but little. Bandage from arm-pits down to the knees; was much in the open air. After twelve months, the protuberance went down, leaving a ridge like a pigeon’s breast down the centre; shape of head completely changed, and the boy was perfectly well.

XCI.—Syphilis.

This complaint always succumbs to the treatment; and a cure effected by it leaves none of those lamentable consequences which attend the exhibition of drugs. By Hydropathic means, the virus is completely thrown out of the system through the pores; whilst the administration of mercury is attended with secondary symptoms, which are more fatal than the disease itself. If taken in time, secondary symptoms are also cured at Gräfenberg. It frequently happens, that patients treated for another complaint, find syphilis return, though they imagined themselves cured of it years before. Recent cases of syphilis in otherwise healthy persons, are generally cured in less than two months; but the cure of secondary symptoms is a work of time.

There are many sufferers from this undermining malady, who have been at Gräfenberg one, two, and even three years. In health, they are much improved; but the malady is too deeply seated to be eradicated. One gentleman, when I was there, was refused admittance; he died in a few days, when it was found that mercury had eaten part of his wind-pipe away—a result that never could have been brought about by water.

The following is another deplorable case, the result of bad treatment:—Patient aged
thirty-five, tall, thin, and bent when walking; supports his head by pressing his hands on each side of it; part of the cranium destroyed. The brain covered over by a skin; the parietal bones destroyed, and thick pus exudes between the skin and bone, and smells horribly. Inside of the left eye is an ulceration with raised borders, which allows a portion of the orbital arch to be seen surrounded with pus; pulse weak and irregular; constant pain. Treated for secondary symptoms, with mercury in 1841; came to Gräfenberg with three ulcers the size of a shilling on his forehead, with burning pains. Packing-sheets and tepid baths morning and evening, with other intermediate treatment.

This case is introduced to show the sort of cases Mr. Priessnitz will undertake: of course, a cure will require a considerable time.

XCII.—Chancre.

Case of a very strong young man:—

For five days—sweating (after perspiration broke out) morning, one hour; afternoon, half an hour; then tepid bath, followed by cold bath and back to tepid.

After five days—from sweating went into plunging cold bath; in another week, douched from two to five minutes at eleven o’clock; bandage round the body and on the sores, which were bathed and had water thrown on them frequently; wore suspending bandages; eat sparingly; no meat or butter, and took but little exercise. Perfectly cured in six weeks.

XCIII.—Gonorrhœa and Chancres.

Sweating, followed by bath in the morning; douche at eleven; at twelve, rubbing-sheet and sitz-bath; afternoon, packing-sheet and bath; chancres increased to the size of a sixpence then, and in two days cicatrised. Patient cured in twenty-five days.

Gonorrhœa, &c.—Packing-sheet, tepid bath, and sitz-baths were the means used. The complaint continuing, Priessnitz supposed it arose
from debility of the parts, and ordered:—

Six sitz-baths of ten minutes, allowing five minutes to elapse between each, twice a-day; packing-sheets to be changed as soon as warm, followed by cold bath.

A young man, immediately on discovering this complaint, who took sitz-baths as above described, injected cold water into the urethra, bandaged the parts and drank plentifully of cold water and lived low; was cured in two days.

Another person was subject to involuntary emissions, by which his strength was wasting away. In a month after he began the cure, he found an old gonorrhœa return (which had evidently been driven into the system and was the cause of his malady); he was now treated for this and restored to perfect health.

A Russian officer, declared cured of chancre three years before, found the complaint return, when he was again treated by mercury. His throat continued to trouble him, his voice was husky, and piles began to make their appearance. After pursuing the Water-cure for a short time, as described in a former case, he had a crisis in his foot, and diarrhœa for a fortnight, when he passed a considerable quantity of blood. After this, the piles disappeared entirely, and his voice became sound and clear. It should be observed that he sweated alternate mornings only; the other mornings, packing-sheets and bath.

A young man aged 23, attacked with secondary symptoms: sore throat, etc., was ordered three packing-sheets and cold baths a-day; rubbing-sheet and sitz-bath.

I knew another strong young man suffering under secondary symptoms, so that he could hardly walk with the use of a stick; he
went to Gräfenberg, staid there two months, and returned to England the picture of health.

As there are always at Gräfenberg a large number of individuals labouring under these complaints, cases of cure might be adduced ad infinitum: suffice it to say, that hydropathy in their cure is omnipotent. Buboes and chancres, when taken in their infancy, are eradicated from the system in a few weeks, sometimes days, without the debilitating effects attendant upon other deceitful remedies.

XCIV.—Scrofula and Vaccination.

Priessnitz, when asked what he conceived to be the cause of such an increase of scrofula as is said to have taken place of late years, said, he attributed it to vaccination, syphilis and drugs. When vaccination is performed without producing its desired effect, the virus remains in the system, and when it proceeds favourably, it is a question if it is ever thoroughly ejected. Every practitioner knows the difficulty that exists of finding children from which to take matter where no taint is in the blood. The child subjected to vaccination is not only exposed to the sins of his own forefathers, but also to those of the stranger. The consequences attendant upon syphilis, and the evil results of mineral poisons, are such as to lead us to believe that Priessnitz’ opinion is not without foundation.

I am doubtful whether scrofula is ever cured,[7] though whilst at Gräfenberg I saw many obstinate cases relieved. Children who arrived there perfect cripples, were enabled to use their limbs like other people. I think I may in great truth say, that in all cases the enemy received a check, and the general health of the patient was improved.

A patient states, that previous to inoculation his family were well; but since that operation they have been scrofulous. He came to Gräfenberg some years ago from Dartres, when Priessnitz told him
to go home, give up all beverages but water, use cold baths daily, and he would be well; though incredulous, he followed the advice, and in two years was perfectly cured.

For scrofula, the whole treatment must be persevered in for a long time.

XCV.—Piles.

Piles are caused by an accumulation of blood in the vessels which merge into the large intestines; they either discharge blood, or are confined to a swelling of the veins, in otherwise healthy subjects. Hydropathy effects a radical cure of this complaint, whilst medical remedies are only temporary, and often lead to serious consequences.

Treatment.—Morning, three rubbing-sheets and sitz-bath, twenty minutes; noon, the same; afternoon, the same, and an additional sitz-bath, making four sitz-baths during the day. At night, a rubbing-sheet but no sitz-bath, as it is too late to walk after it. Body bandage; much water to be drunk; douche four to eight minutes in the middle of the day, if possible.

Out of the general treatment, persons troubled with piles may take sitz-baths and wear a bandage on the part affected.

A patient having piles and sore eyes, was advised to take neither sitz-baths nor eye-baths. When Priessnitz was asked the reason, he said, “Because you have too much bad matter in your system, which I am afraid of attracting to those parts.”

In a common attack of piles, two or three sitz-baths a-day, fifteen minutes each, and wearing a bandage upon the part at night, will afford relief. Persons subject to piles should especially avoid all heating and stimulating drinks.
XCVI.—Rupture.

I knew of a case of double rupture, in an officer 34 years of age, which was perfectly cured at Gräfenberg in three years. Another case of single rupture was cured in nine months, and a recent one cured in four months. There can be no doubt of the complete omnipotence of Hydropathy over this malady; its cure is only a matter of time. It is difficult to lay down any prescribed treatment, as the chief aim of the practitioner must be to bring his patient into fine health. All organic action is contraction; all strength depends upon the power of the different parts of the body to contract, and nothing will aid the operation so much as the different appliances here made use of.

As a rule, I observed that when rupture exudes, the sweating process should be resorted to; when perspiration has broken out, gently rub the part with the hand until the rupture is gone in again. Bandages are worn continually.

XCVII.—Chilblains.

Rub the feet or hands affected for a quarter of an hour in tepid water three times a-day, and bandage the leg from ankle to knee if in the feet. If in the hand, the arm from wrist to elbow.

XCVIII.—Cold Feet.

Take a shallow foot-bath, cold, one inch deep, before going to bed, for fifteen minutes; let the feet be well rubbed the whole time, then walk about the room bare-footed for half an hour, so that re-action may take place, or they will be colder than before.

XCIX.—Eruption, Scabs, and Sores on the Arms.
A child had tried sulphur bandages and all other conceivable means:—

Morning, noon, and afternoon, packing-sheet and tepid-bath; the latter after a few days changed to the cold-bath; bandages night and day; cure effected in a few weeks.

C.—Consumption.

Until the age of fifteen or sixteen Priessnitz conceives this complaint to be always curable. Very often when parties are supposed to be consumptive, they are not so. A young lady arrived at Gräfenberg during my stay there. I thought she had delayed it too long; she appeared in the last stage of consumption. Priessnitz however took the case—and, principally with rubbing-sheets, administered three times a-day, effected an extraordinary cure in two months. I saw this lady afterwards at Florence, and was quite surprised to see what an extremely fine woman she had become.

There was also a young lady suffering under the following symptoms:—great debility, very thin, weak eyes, little or no appetite, and a short cough, which would awaken her about four o’clock in the morning, and trouble her the whole day. She was considered by M.D.’s as consumptive. Priessnitz took a different view of the case, and as she was cured in two months he was right. Her treatment was as follows:—

Morning, packing-sheet and plunge bath, the tepid-bath having been used only for a short time; at ten o’clock, douche; at eleven, rubbing-sheet and eye-bath; at five, packing-sheet and bath; chest, waist, and forehead bandaged every night; waist bandaged always.

Consumption of the Nerves.—A gentleman aged 30, came to Gräfenberg in a most deplorable state, supported on one side by his wife, on the other by his servant. Second night he was taken alarmingly ill, with a fever and a stoppage in his bowels. He was too
weak for a packing-sheet or tepid-bath, therefore twelve rubbing-sheets were administered within three hours; and two head-baths during the intermediate times. When a change for the better took place, enemas were applied and relief afforded. The next day patient was out of doors. I left Gräfenberg about this time, therefore do not know if he recovered.

Spitting Blood.—A young lady was subjected to spitting blood, pain at the chest, and general debility. Priessnitz doubted if the lungs were affected, and tried packing-sheet and tepid-bath, which patient was found too weak to support. Then rubbing-sheets twice a day; patient still too weak. Then rubbing-sheet, and tepid sitz-bath ten minutes. Feverish excitement and loss of appetite came on. Back of head put into cold water for quarter of an hour; to be repeated several times a day. Bandage at all times down the middle of the breast and round the waist. When spitting of blood came on, then cold foot-baths were resorted to.

Patient tried the treatment for a month, but was not much improved by it. On leaving, Priessnitz advised her to spend the winter in Italy, to eat nothing but bread and grapes, and to use cold ablutions.

CI.—Insanity.

This disease, Priessnitz says is curable, when it proceeds from bodily suffering or disease; but when caused by mental suffering or misfortune, is generally incurable. I witnessed the treatment of a case of aberration of mind at Gräfenberg; the patient was put into a tepid-bath, held there, and rubbed for nine hours and a half; he was then put to bed, and next morning awoke perfectly composed.

Hydrophobia.—Dr. Short in 1656, published a work, in which he stated, that with cold water, he had cured the bite of mad dogs and dropsy. Priessnitz says he never treated the human subject for this complaint, but that he had cured a dog, by tying him up and
throwing a large number of pails of water over him. At first it caused him to shiver a great deal, proving the absence of fever to any extent. When dry the aspersion was repeated; the shivering diminished at each successive aspersion, until it was entirely allayed. If, on throwing bread to a dog thus treated, he will eat it, it is a sign he is cured.

Dr. Sully, of Wivelscombe, in a work published some years ago, states, that he dropped water constantly on the wounded part, and that it invariably acted as a preventive. My impression is, that hydropathy is adapted to the cure of this complaint.

CII.—Cholera.

Spasmodic or pestilential cholera first appeared in England in 1831, and in France in 1832; great difference of opinion exists as to its cause, and hardly two practitioners agree as to the best way to effect a cure. Some persons think, as many would get well without medical aid as with it; and this conjecture is supported by what took place on its visitation in Dublin. The numbers attacked were so great, that for the humble class, large tents were erected outside the city, and the medical men were so harassed by their own connexions within it, that the poor were left very much to fate. On comparing notes of the mortality that took place, it was found, that the number of deaths of those who received medical aid, and those who were deprived of it, were about equal. Pages might be employed in enumerating instances related, in which the cholera was cured by cold water, though administered without reference to any hydropathic rules.

In 1832, Cholera made great ravages in Silesia, when numbers at Freywaldau and the neighbourhood, fell victims. Priessnitz’s patients did not escape, though they avoided its fatal consequences. A friend of mine, who was at Gräfenberg at the time, assures me that in cholera, Priessnitz never lost a case, though seventeen of his patients, and many persons in the neighbourhood, were treated by him.
My landlord at Freywaldau confirmed the last of these statements, and said that his daughter fell a victim, who, he felt persuaded, would have recovered, had she been treated with water instead of drugs.

To ward off this disease, and place the system, if attacked, in the best condition to resist it, we ask the dispassionate reader, are not hydropathic rules in accordance with reason and common sense?

There are three different stages in cholera; the first is that of a common diarrhœa, accompanied with oppression of the chest, anxiety, and collapse of the face; if neglected, it assumes a more serious form, the pulse becomes weak, and there is a difficulty of respiration.

The second stage is ushered in by giddiness, great depression of pulse and of the vital energies, with spasms, accompanied by purging and vomitings.

In the third stage, the patient is suddenly laid prostrate, serous fluid, in large quantities, is discharged from the bowels and stomach, with cramps and spasms, hardly any pulse, and difficult respiration. Under ordinary treatment, this frequently terminates life in a few hours.

To those who have witnessed the wonderful results of the Water-cure treatment in cholic, diarrhœa, &c., it must be evident, that in the primary stages of this malady, the treatment resorted to in those complaints, would be perfectly effectual; and that cholera, in its worst and most fearful form, is to be successfully combated by no other than hydropathic means.

If, after visiting a contagious case, Mr. Priessnitz feels at all uncomfortable, he takes a packing-sheet and tepid-bath.

Asiatic Cholera.—On the first appearance of Cholera symptoms, which are generally those of languor and chilliness, do not wait for a development, but apply most vigorously a rubbing-sheet; then dry the body, and administer a clyster of cold water. In two or three minutes repeat the rubbing-sheet and clyster, wait five minutes and repeat the same a third time. Then a cold sitz-bath, letting two
attendants rub the patient with hands dipped in water, particularly on the abdomen, the whole time; water should be drunk whilst in the sitz-bath, until patient vomits; when cramps in the stomach and vomiting have subsided, place a large bandage round the body, and put him to bed well covered up. After sleeping, apply a tepid-bath with friction for some time. If not cured, renew the whole operation.

If, after the sitz-bath, cholera appears on the advance, warm a blanket, and pack the patient as in the sweating process; if he remains therein several hours, and the symptoms do not decrease, renew the whole proceedings, and again try to produce perspiration; when effected, keep it up two or three hours. After this a tepid-bath 62° with friction. The success of the treatment very much depends upon drinking abundantly of water. The bandages used, should be doubled or trebled, and changed often. If patient is unable to stand or sit upright, lay him on a bed, and let several attendants rub him all over with wet hands.

Extract from a letter from Dr. Gibbs to the editor of the “Water-cure Journal.”

“You cannot have forgotten the consternation of the profession when this fearful disease invaded us in 1832. Neither can you be ignorant that the faculty, generally, are as ill prepared to contend with it now as they were in former years; but for the information of those who may not be as well acquainted with such matters as you must be, I beg to make an extract from the minutes of the proceedings at a meeting of the Western Medical and Surgical Association, as reported in the Lancet of September 19, 1846. In the course of a discussion on the treatment of cholera, Dr. Cahill said, that he ‘positively felt a creeping of the skin at the relation of the enormities which had been perpetrated by practitioners upon their patients.
When he listened to the recital of practitioners who described the extravagant cases of mercury and of opium which they administered, he could not refrain from fancying that he was witnessing the orgies of so many Indian savages, whilst counting the scalps of their victims. He thought it a pity that the invention of such a system of torture should not experience the fate of the inventor of the brazen bull, and illustrate upon his own person the efficacy of his infernal ingenuity. He believed that in the majority of persons who died of Asiatic cholera, death was the consequence of the treatment rather than of the disease. He had seen above a thousand cases of Asiatic cholera; and in no instance had he seen any benefit from any mode of treatment. On the contrary, he had seen persons die of narcotism, who would have survived if left to the vis medicatrix naturæ. He had seen others die of absorption of air through the veins when the saline fluid was ejected; and he knew many who had had the extraordinary luck to escape both the doctor and the disease, yet rendered miserable for the remainder of life by the effects of the immense doses of mercury which had been given to them during the cholera paroxysms. In fact, it was afflicting to contemplate the sufferings which the rash and empirical practice of the profession in the management of this epidemic had created.’ The learned gentleman likewise said ‘With respect to cholera, since nothing was known of its nature, and no treatment had any influence over it, the best plan was to do as little as possible: give carrara, soda, or pump-water, with a little laudanum, perhaps in the diarrhœal stage, and the patient would not be deprived of the chance which nature had given him.’

“It is to be presumed that the doctor had not seen this disease treated by the Water-cure, under the operation of which, if I am correctly informed, and as I can readily believe, results very different from those, which he witnessed, were obtained. It is stated that more than twenty cases were successfully treated by Priessnitz, and between thirty and forty at Breslau, by a clergyman, whose name I regret that I have forgotten; and it is added that neither practitioner lost a patient by death.
The treatment adopted by each of them was nearly the same; the principal difference between them being, that the one employed the sitz-bath, and the other the shallow tepid-bath.

[“]If on the appearance of the premonitory symptoms, judicious treatment be promptly adopted, it seems not improbable that the disease may be cut short. Those symptoms may be any combination of the following:—shivering, dizziness, a ringing noise in the ears, a small quick pulse, accelerated respiration, languor, præcordial anxiety, a cold white tongue, nausea, vomiting, severe gripings, and watery diarrhœa. If it be not checked, the disease quickly passes into the second or algid stage; the circulation becomes feeble, the blood is drained of its fluid, the muscles are contracted and cramped, the tongue is colder and whiter, the thirst becomes burning, the lips livid; the features contracted, the extremities shrivelled, and the skin cold, clammy, and discoloured.

“Little is known respecting the nature of this disease; but the most rational opinion seems to be, that it owes its origin to a poison pervading the blood; deranging the balance between the arterial and venous circulation, impairing the nervous energy, and impeding all the functions of the various organs, excepting the secretions from the stomach and bowels; the preternatural excitement of which would seem to indicate an effort of nature to expel the disturbing causes from the system. This opinion obtains additional probability from the fact, which often has been observed, that the more profuse is the diarrhœa, the less fatal is the disease.

“Cholera may suddenly appear without manifesting any, or at least with very slight, premonitory symptoms; especially where the patient is labouring under any serious affection of the brain, lungs, or air-passages, when it will sometimes graft itself on the primary disease, and aggravate all its most various symptoms.

“On the first manifestation of premonitory symptoms, immediate recourse should be had to repeated friction in a wrung-out sheet, as in the earlier stages of fever.
This will tend to stimulate the nervous energy, and to maintain or re-establish the balance of circulation between the arterial and venous systems; will counteract the disposition to internal congestion by promoting cuticular circulation; will aid the lungs by freeing the exhalants of the skin, and will forward the elimination of the virus through the same channels.
“But it will not be sufficient merely to attempt to resist the encroachments of the disease; the efforts of nature to expel the cause of it, also claim assistance. To this end cold or tepid water should be freely drunk to facilitate the vomiting, to dilute and weaken the action of the poison, to stimulate the kidneys, and to supply the waste of fluid in the blood. Dr. Rutty, in his synopsis, says, ‘It [the drinking of water] has also frequently been found efficacious in stopping violent vomitings and purgings, partly as a diluent, and partly as a bracer to the fibres; and in violent, deplorable choleras, cold water is recommended by the ancients, and at this time is ordered by Spanish physicians with good success, though Celsus orders it warm.’

“Enemata of pure water, tepid or cold, should likewise be freely administered; the quantity administered to an infant at one time should not exceed two ounces; four ounces would be sufficient for a child six years old; eight ounces for a youth of fifteen, and fifteen or sixteen ounces for an adult.

“But the principal process is long and entire friction, either in the shallow tepid-bath or in the sitz-bath. The latter seems to deserve the preference, inasmuch as it will more directly and powerfully aid nature in her efforts; its primary action being that of a purgative, while a less body of water will suffice, than could be made to fulfil the same intention in a vessel of the shape and size of the half bath; but, if the sitz-bath be employed, then friction with wet hands should be applied to the extremities. Cold water may be used in the sitz-bath, provided that there is nothing in the previous state of the patient to contra-indicate its use; in which case tepid water must be employed. Tepid water about 70° Fahr. may likewise be employed in the shallow bath, as the body of water therein must be greater than the sitz-bath; but warm applications are never indicated. Vapour-baths have been tried to recall the circulation to the surface, but without effect. On this point, Dr. Daun in his ‘Medical Reports on Cholera,’ says, ‘O’Brien lay on the steam couch for three hours before he expired, in a heat that I am convinced would have raised a lifeless body to a temperature nearly, if not equal, to that of a