Creating a new thread is pretty straightforward. Here's a simple code
fragment that downloads a set of Web pages in parallel. For each
request it's given, the code creates a separate thread that handles
the HTTP transaction.
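require 'net/http'
pages = %w( www.rubycentral.com
            www.awl.com
            www.pragmaticprogrammer.com
          )
threads = []
for page in pages
  # Hand page to the thread as an argument; the block receives it as myPage.
  threads << Thread.new(page) { |myPage|
    h = Net::HTTP.new(myPage, 80)
    puts "Fetching: #{myPage}"
    resp = h.get('/')   # in current Ruby, get returns the response object
    puts "Got #{myPage}: #{resp.message}"
  }
end
# Wait for every thread to finish before the program exits.
threads.each { |aThread| aThread.join }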
produces:
Fetching: www.rubycentral.com
Fetching: www.awl.com
Fetching: www.pragmaticprogrammer.com
Got www.rubycentral.com: OK
Got www.pragmaticprogrammer.com: OK
Got www.awl.com: OK
Let's look at this code in more detail, as there are a few subtle
things going on.
New threads are created with the Thread.new call. It is given a block
that contains the code to be run in a new thread. In our case, the
block uses the net/http library to fetch the top page from each of our
nominated sites. Our tracing clearly shows that these fetches are
going on in parallel.
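
The same overlap can be seen without a network connection. The
following is a minimal sketch, not the fetch code itself: sleep stands
in for the HTTP transaction, and the names started, results, and
elapsed are made up for illustration.

```ruby
# sleep stands in for a slow HTTP transaction (an assumption of this sketch).
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

threads = %w(a b c).map { |name|
  Thread.new(name) { |myName|
    sleep 0.2            # pretend to wait on the network
    "done #{myName}"     # the block's last value becomes the thread's value
  }
}

# value waits for the thread to finish (like join) and returns the block's result.
results = threads.map { |t| t.value }
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

p results
puts "elapsed: #{elapsed.round(2)} seconds"
```

Run one after another, the three waits would take at least 0.6
seconds. Run in threads, they overlap, so the total is normally close
to 0.2 seconds.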
When we create the thread, we pass the address of the required page in
as a parameter. This parameter is passed on to the block as myPage.
Why do we do this, rather than simply using the value of the variable
page within the block?
A thread shares all global, instance, and local variables that are in
existence at the time the thread starts. As anyone with a kid brother
can tell you, sharing isn't always a good thing. In this case, all
three threads would share the variable page. The first thread gets
started, and page is set to "www.rubycentral.com". In the meantime,
the loop creating the threads is still running. The second time
around, page gets set to "www.awl.com". If the first thread has not
yet finished using the page variable, it will suddenly start using its
new value. These bugs are difficult to track down.
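
The problem can be reproduced on demand. In this minimal sketch, which
is not part of the original example, the Queue called gate is an
assumption introduced only to hold every thread back until the loop
has finished, so the race always goes the wrong way. The short strings
stand in for page addresses.

```ruby
require 'thread'   # Queue; harmless on Rubies where it is already loaded

gate    = Queue.new   # hypothetical helper: threads block here until released
threads = []

for page in %w(a b c)
  threads << Thread.new {
    gate.pop   # wait until the loop below has finished
    page       # reads the shared variable, not a private copy
  }
end

3.times { gate << :go }               # release all three threads
seen = threads.map { |t| t.value }    # value joins and returns the block's result

p seen   # => ["c", "c", "c"]
```

By the time any thread reads page, the loop has already moved it on to
its final value, so all three threads see the same string.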
However, local variables created within a thread's block are truly
local to that thread: each thread will have its own copy of these
variables. In our case, the variable myPage will be set at the time
the thread is created, and each thread will have its own copy of the
page address.
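
A minimal sketch of this behavior, again without any network access.
The Queue called gate is an assumption of the sketch: it delays every
thread until the loop has finished and page has reached its last
value, which is the worst case for a shared variable. The short
strings stand in for page addresses.

```ruby
require 'thread'   # Queue; harmless on Rubies where it is already loaded

gate    = Queue.new   # hypothetical helper: threads block here until released
threads = []

for page in %w(a b c)
  threads << Thread.new(page) { |myPage|
    gate.pop   # wait until the loop has finished and page has changed
    myPage     # still the value that was passed in at creation time
  }
end

3.times { gate << :go }               # release all three threads
seen = threads.map { |t| t.value }    # value joins and returns the block's result

p seen   # => ["a", "b", "c"]
```

Each thread reports the value it was created with, even though page
itself holds "c" by the time the threads run.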