Discussion:
Question about message passing paradigm
(too old to reply)
unknown
2008-06-29 05:06:25 UTC
Permalink
Hello,

I was not sure where best to post this and hope someone is able to
help with this question.

After being bitten by the pitfalls of lock-oriented multi threading I
am interested in switching to message passing oriented concurrency.

I have read that erlang has a per-process ordering guarantee (that is
if A sends messages 1 and 2 to B 1 will arrive before 2 at B. However,
there is no guarantee that messages from C and D will not be placed in
between 1 and 2.

So my question is this:
In my current lock oriented program design I have threads dedicated to
managing different collections and actions on those collections
(hashmaps), lets call them A, B, C, and D. At several points in the
program one thread needs to get data from the other collections and
make a decision based on the collective state of the values it gathers
form those other collections.

An example situation would be A needing data from B, C, and D. In that
case A would attempt to lock A, B, C, and D then gather the items it
needs and release the lock.

How would I accomplish this same task in a message passing manner?

I had though of doing: A sends a message to B, C, and D asking for the
data. However, B, C, and D may have each changed independently of each
other by the time they receive the request for data and/or by the time
they are able to send the message to A. So, how do you deal with
situations like this in a message passing paradigm?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20080628/5929b165/attachment.html>
unknown
2008-06-29 10:42:04 UTC
Permalink
I'm a bit of an Erlang newbie so the gurus may come up with something much
better than this, but I can think of these possible approaches:

1. Create a gen_fsm that controls all the collections. The collections
could be ETS tables or gen_servers wrapping ETS tables. Under normal use,
messages are sent to the fsm to update the collections individually. When
the time comes to require consistency across the collections, send a message
to the fsm to get the collective state data. The fsm goes into a different
state while it gathers the data. This state would reject requests to update
the collections (or wait until the state changes), although reads would
still be allowed. On getting the result, the state changes back to allow
updates again.
2. Create a memory-only Mnesia table for each collection, and use Mnesia
transactions to get the multiple values atomically.
3. Change the architecture of the current lock-oriented program to make
better use of Erlang's features.

Hope this helps.
Post by unknown
Hello,
I was not sure where best to post this and hope someone is able to
help with this question.
After being bitten by the pitfalls of lock-oriented multi threading I
am interested in switching to message passing oriented concurrency.
I have read that erlang has a per-process ordering guarantee (that is
if A sends messages 1 and 2 to B 1 will arrive before 2 at B. However,
there is no guarantee that messages from C and D will not be placed in
between 1 and 2.
In my current lock oriented program design I have threads dedicated to
managing different collections and actions on those collections
(hashmaps), lets call them A, B, C, and D. At several points in the
program one thread needs to get data from the other collections and
make a decision based on the collective state of the values it gathers
form those other collections.
An example situation would be A needing data from B, C, and D. In that
case A would attempt to lock A, B, C, and D then gather the items it
needs and release the lock.
How would I accomplish this same task in a message passing manner?
I had though of doing: A sends a message to B, C, and D asking for the
data. However, B, C, and D may have each changed independently of each
other by the time they receive the request for data and/or by the time
they are able to send the message to A. So, how do you deal with
situations like this in a message passing paradigm?
_______________________________________________
erlang-questions mailing list
erlang-questions
http://www.erlang.org/mailman/listinfo/erlang-questions
--
The great enemy of the truth is very often not the lie -- deliberate,
contrived and dishonest, but the myth, persistent, persuasive, and
unrealistic. Belief in myths allows the comfort of opinion without the
discomfort of thought.
John F. Kennedy 35th president of US 1961-1963 (1917 - 1963)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20080629/4a396bc8/attachment.html>
unknown
2008-06-30 06:45:07 UTC
Permalink
What about:

4. Send a message to A, containing both the actual request for A, and a
function FA to be executed by A. This function FA will send a message to
B, containing the actual request for A, and a function FB to be executed
by B. This function FB will send a message to C, containing the actual
request for C, and a function FC to be executed by C. This function FC
will send a message to D, containing the actual request for D, and
possibly a function FD that does nothing. All the processes wait for the
answer to their messages, of course. This way, A will not process
anything (and therefore not change state), while waiting for B, which
waits for C, which waits for D. B will not change state while waiting
for C and D, and C will not change state while waiting for D. This is
like locking A, then locking B, then C, then D, then releasing D, C, B
and A.

Beware of cyclic "locks" in this approach!

*Erik.



________________________________

From: erlang-questions-bounces
[mailto:erlang-questions-bounces] On Behalf Of Edwin Fine
Sent: Sunday, June 29, 2008 12:42 PM
To: Mike T
Cc: erlang-questions
Subject: Re: [erlang-questions] Question about message passing
paradigm


I'm a bit of an Erlang newbie so the gurus may come up with
something much better than this, but I can think of these possible
approaches:


1. Create a gen_fsm that controls all the collections. The
collections could be ETS tables or gen_servers wrapping ETS tables.
Under normal use, messages are sent to the fsm to update the collections
individually. When the time comes to require consistency across the
collections, send a message to the fsm to get the collective state data.
The fsm goes into a different state while it gathers the data. This
state would reject requests to update the collections (or wait until the
state changes), although reads would still be allowed. On getting the
result, the state changes back to allow updates again.
2. Create a memory-only Mnesia table for each collection,
and use Mnesia transactions to get the multiple values atomically.
3. Change the architecture of the current lock-oriented
program to make better use of Erlang's features.


Hope this helps.


2008/6/29 Mike T <talmage.news>:


Hello,

I was not sure where best to post this and hope someone
is able to
help with this question.

After being bitten by the pitfalls of lock-oriented
multi threading I
am interested in switching to message passing oriented
concurrency.

I have read that erlang has a per-process ordering
guarantee (that is
if A sends messages 1 and 2 to B 1 will arrive before 2
at B. However,
there is no guarantee that messages from C and D will
not be placed in
between 1 and 2.

So my question is this:
In my current lock oriented program design I have
threads dedicated to
managing different collections and actions on those
collections
(hashmaps), lets call them A, B, C, and D. At several
points in the
program one thread needs to get data from the other
collections and
make a decision based on the collective state of the
values it gathers
form those other collections.

An example situation would be A needing data from B, C,
and D. In that
case A would attempt to lock A, B, C, and D then gather
the items it
needs and release the lock.

How would I accomplish this same task in a message
passing manner?

I had though of doing: A sends a message to B, C, and D
asking for the
data. However, B, C, and D may have each changed
independently of each
other by the time they receive the request for data
and/or by the time
they are able to send the message to A. So, how do you
deal with
situations like this in a message passing paradigm?
_______________________________________________
erlang-questions mailing list
erlang-questions
http://www.erlang.org/mailman/listinfo/erlang-questions





--
The great enemy of the truth is very often not the lie --
deliberate, contrived and dishonest, but the myth, persistent,
persuasive, and unrealistic. Belief in myths allows the comfort of
opinion without the discomfort of thought.
John F. Kennedy 35th president of US 1961-1963 (1917 - 1963)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20080630/f5525e43/attachment.html>
unknown
2008-07-01 03:33:27 UTC
Permalink
The problem we are discussing is
processes B, C, D hold information X, Y, Z respectively;
process A wants a coherent snapshot of X, Y,Z.

There are actually two slightly different cases depending on
A needs "X Y Z as of *now*" (A, B, C, and D must all
synchronise), or A needs "X Y Z as of *some* time in the
recent past" (B, C, and D must all synchronise but can then
send the information to A without waiting for A to receive it).

I like this problem because it is simple yet subtle.
One way that it is subtle is that in "multithread" programming
most people STILL think in terms of a single absolute time
shared by all threads. (After all, there _is_ such a thing,
the system clock. And yes, it's not exactly true, but it's
close enough to make people _think_ it's true.) But when you
start thinking about Erlang and especially *distributed*
Erlang, you start realising that "now" is a pretty fuzzy
concept.

To me, the easiest approach to *think* was

A sends "tell me X" to B, "tell me Y" to C,
and "tell me Z" to D in any order,
then does three receives in any order to
collect the results, then
sends "thanks" to B, C, and D in any order.

B receives "tell me X" from A, computes X,
sends "here is X" to A, then waits for
"thanks" from A.

C and D are just like B.

This has 9 messages compared, but it is symmetric, and it is
actually pretty close to the locking technique (to "lock" a
process, send it a message asking it to wait, receive its
acceptance, and then send it a message telling it to proceed).
This extends to any number of processes E, F, G, ... and none
of the processes except A has to know about any of the others.

On 30 Jun 2008, at 6:45 pm, Erik Reitsma proposed a scheme
that doesn't actually need any functions to be sent around
(which is nice if you are doing distribution):

A sends "I need X Y and Z, ask C and D for Y and Z"
to B then waits.
B sends "A needs Y and Z, ask D for Z, here is X"
to C. B then waits for the go-ahead.
C sends "A needs Z, here are X and Y" to D,
then waits.
D sends "here are X Y and Z" to A.
If a four-way synchronisation is wanted, D waits.
If a three-way is wanted, it doesn't.
A receives X, Y, and Z. It sends "go ahead" to B
and C (and to D if four-way is wanted).

This is 7 messages if the "NOW" reading with four-way
synchronisation is wanted, or 6 if the "some time in the
recent past" reading with three-way synchronisation is
wanted. Since B, C, and D have to be told that their
information is needed and A has to receive the results,
four messages are needed.

Can it be done in fewer than 7 (or 6) messages?
Are there other readings of the requirements that I've missed?
How do/should we think about these problems?

Pace Dijkstra, I'm afraid that I came up with the schemes
above in a very anthropomorphic CRC-card-ish way:
who knows what?
who needs to know what?
can the message count be reduced if I ask
someone to do something for me?
plus a sort of intuitive idea that
to get N processes to synchronise, get all
but 1 of them waiting for something

There has to be a tutorial, perhaps even a textbook, waiting
to be written about "how to think about messages".
unknown
2008-07-01 03:42:51 UTC
Permalink
Richard,

I'm new to Erlang and FP, and I want to learn from my mistakes or
misunderstandings. Please could you critique the suggestion I sent in
regarding this problem? The fact that nobody commented means either that it
was (a) totally naive and not worthy of comment (or to spare my feelings),
or (b) such a good idea that all were rendered speechless with admiration.
Somehow the probability of (b) seems rather low. So... where did I go wrong?

Regards,
Edwin FIne

On Mon, Jun 30, 2008 at 11:33 PM, Richard A. O'Keefe <ok>
Post by unknown
The problem we are discussing is
processes B, C, D hold information X, Y, Z respectively;
process A wants a coherent snapshot of X, Y,Z.
There are actually two slightly different cases depending on
A needs "X Y Z as of *now*" (A, B, C, and D must all
synchronise), or A needs "X Y Z as of *some* time in the
recent past" (B, C, and D must all synchronise but can then
send the information to A without waiting for A to receive it).
I like this problem because it is simple yet subtle.
One way that it is subtle is that in "multithread" programming
most people STILL think in terms of a single absolute time
shared by all threads. (After all, there _is_ such a thing,
the system clock. And yes, it's not exactly true, but it's
close enough to make people _think_ it's true.) But when you
start thinking about Erlang and especially *distributed*
Erlang, you start realising that "now" is a pretty fuzzy
concept.
To me, the easiest approach to *think* was
A sends "tell me X" to B, "tell me Y" to C,
and "tell me Z" to D in any order,
then does three receives in any order to
collect the results, then
sends "thanks" to B, C, and D in any order.
B receives "tell me X" from A, computes X,
sends "here is X" to A, then waits for
"thanks" from A.
C and D are just like B.
This has 9 messages compared, but it is symmetric, and it is
actually pretty close to the locking technique (to "lock" a
process, send it a message asking it to wait, receive its
acceptance, and then send it a message telling it to proceed).
This extends to any number of processes E, F, G, ... and none
of the processes except A has to know about any of the others.
On 30 Jun 2008, at 6:45 pm, Erik Reitsma proposed a scheme
that doesn't actually need any functions to be sent around
A sends "I need X Y and Z, ask C and D for Y and Z"
to B then waits.
B sends "A needs Y and Z, ask D for Z, here is X"
to C. B then waits for the go-ahead.
C sends "A needs Z, here are X and Y" to D,
then waits.
D sends "here are X Y and Z" to A.
If a four-way synchronisation is wanted, D waits.
If a three-way is wanted, it doesn't.
A receives X, Y, and Z. It sends "go ahead" to B
and C (and to D if four-way is wanted).
This is 7 messages if the "NOW" reading with four-way
synchronisation is wanted, or 6 if the "some time in the
recent past" reading with three-way synchronisation is
wanted. Since B, C, and D have to be told that their
information is needed and A has to receive the results,
four messages are needed.
Can it be done in fewer than 7 (or 6) messages?
Are there other readings of the requirements that I've missed?
How do/should we think about these problems?
Pace Dijkstra, I'm afraid that I came up with the schemes
who knows what?
who needs to know what?
can the message count be reduced if I ask
someone to do something for me?
plus a sort of intuitive idea that
to get N processes to synchronise, get all
but 1 of them waiting for something
There has to be a tutorial, perhaps even a textbook, waiting
to be written about "how to think about messages".
--
The great enemy of the truth is very often not the lie -- deliberate,
contrived and dishonest, but the myth, persistent, persuasive, and
unrealistic. Belief in myths allows the comfort of opinion without the
discomfort of thought.
John F. Kennedy 35th president of US 1961-1963 (1917 - 1963)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20080630/5455b955/attachment.html>
unknown
2008-07-01 04:48:35 UTC
Permalink
Post by unknown
Richard,
I'm new to Erlang and FP, and I want to learn from my mistakes or
misunderstandings. Please could you critique the suggestion I sent
in regarding this problem? The fact that nobody commented means
either that it was (a) totally naive and not worthy of comment (or
to spare my feelings), or (b) such a good idea that all were
rendered speechless with admiration. Somehow the probability of (b)
seems rather low. So... where did I go wrong?
Let's recapitulate. If I've found the right message,
Edwin Fine had three suggestions:

? Create a gen_fsm that controls all the collections.
The collections could be ETS tables or gen_servers wrapping ETS
tables.
Under normal use, messages are sent to the fsm to update the
collections
individually. When the time comes to require consistency across the
collections, send a message to the fsm to get the collective state
data.
The fsm goes into a different state while it gathers the data.
This state would reject requests to update the collections (or wait
until
the state changes), although reads would still be allowed.
On getting the result, the state changes back to allow updates again.
? Create a memory-only Mnesia table for each collection, and use Mnesia
transactions to get the multiple values atomically.
? Change the architecture of the current lock-oriented program to make
better use of Erlang's features.

From bottom to top:
- The last one doesn't really answer the orignal poster's question.
At least, imagining myself in the OP's shoes, I would not find that
answer informative. Change it HOW? WHICH features? In what way
better?

- The second one might well be the right thing to do in a production
system. However, someone who is still struggling with how to use
messages
probably doesn't want to be told to learn another huge great
thing, and
might well get the impression that message passing wasn't much good
after all.

- As for the first one, it seemed to me that in order to do that,
you'd
pretty much have to solve the original problem anyway, plus you
would
have to come to grips with behaviours, callbacks, gen_fsm, and a
lot of
stuff which is practically very very useful, but not something to be
understood in five minutes.

I suppose I can summarise it as "You gave OTP answers to what I saw as
an Erlang question". I may well be completely mistaken about where the
OP was coming from, but I understood the question to be specifically a
question "how can the OP solve this problem directly using message
passing."
OTP answers are very often precisely the right answers, so don't stop
giving them.
unknown
2008-07-01 07:37:20 UTC
Permalink
Thank you for a very interesting and informative analysis. I must admit that
I tend to lump together Erlang/OTP as "Erlang" and see solutions in that
context. In a way, I feel as if using Erlang "standalone" without OTP is
very roughly analogous to using C++ without the STL: it can be done, but why
on earth would one want to?

I agree that the last answer didn't answer much. What I was trying to say is
that the architecture should perhaps be redesigned to fit a Functional
Programming/Message Passing paradigm and infrastructure, but I couldn't
suggest precisely how because there was almost no information about the
original architecture. So in retrospect I wound up saying nothing useful.

As a final note, let me say that I was thrown in to the deep end in Erlang
by having to learn Erlang and OTP concurrently (pun intended) because I had
to create a production system in a short period of time. This is why it's a
bit hard for me to separate the two.

Thanks again for clearly identifying the core issues, as you usually seem to
do.

<off-topic>One last thing: I read the Ethics of Belief after poring over one
of your posts recently, and was exceptionally impressed with the gentleman's
writing and philosophy, with which I strongly agree. In that regard, I'd
like to misquote John Stuart Mill, namely, "A foolish *certainty* is the
hobgoblin of little minds". (Actually, I think my version is a slight
improvement ;) I am unfortunately seeing the mechanism of insufficiently
examined beliefs at work today, resulting in the persecution of a friend of
mine by way of (the almost totally belief-based) Shaken Baby Syndrome. So
the essay really resonated deeply with me. I daresay William K. Clifford
would have had a lot to say about this. I wish he were still alive to do so.
Bertrand Russel too.</offtopic>

On Tue, Jul 1, 2008 at 12:48 AM, Richard A. O'Keefe <ok>
Post by unknown
Post by unknown
Richard,
I'm new to Erlang and FP, and I want to learn from my mistakes or
misunderstandings. Please could you critique the suggestion I sent in
regarding this problem? The fact that nobody commented means either that it
was (a) totally naive and not worthy of comment (or to spare my feelings),
or (b) such a good idea that all were rendered speechless with admiration.
Somehow the probability of (b) seems rather low. So... where did I go wrong?
Let's recapitulate. If I've found the right message,
? Create a gen_fsm that controls all the collections.
The collections could be ETS tables or gen_servers wrapping ETS
tables.
Under normal use, messages are sent to the fsm to update the collections
individually. When the time comes to require consistency across the
collections, send a message to the fsm to get the collective state data.
The fsm goes into a different state while it gathers the data.
This state would reject requests to update the collections (or
wait until
the state changes), although reads would still be allowed.
On getting the result, the state changes back to allow updates again.
? Create a memory-only Mnesia table for each collection, and use Mnesia
transactions to get the multiple values atomically.
? Change the architecture of the current lock-oriented program to make
better use of Erlang's features.
- The last one doesn't really answer the orignal poster's question.
At least, imagining myself in the OP's shoes, I would not find that
answer informative. Change it HOW? WHICH features? In what way better?
- The second one might well be the right thing to do in a production
system. However, someone who is still struggling with how to use
messages
probably doesn't want to be told to learn another huge great thing, and
might well get the impression that message passing wasn't much good
after all.
- As for the first one, it seemed to me that in order to do that, you'd
pretty much have to solve the original problem anyway, plus you would
have to come to grips with behaviours, callbacks, gen_fsm, and a lot of
stuff which is practically very very useful, but not something to be
understood in five minutes.
I suppose I can summarise it as "You gave OTP answers to what I saw as
an Erlang question". I may well be completely mistaken about where the
OP was coming from, but I understood the question to be specifically a
question "how can the OP solve this problem directly using message
passing."
OTP answers are very often precisely the right answers, so don't stop
giving them.
--
The great enemy of the truth is very often not the lie -- deliberate,
contrived and dishonest, but the myth, persistent, persuasive, and
unrealistic. Belief in myths allows the comfort of opinion without the
discomfort of thought.
John F. Kennedy 35th president of US 1961-1963 (1917 - 1963)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20080701/2c3a0be1/attachment.html>
unknown
2008-07-02 00:20:57 UTC
Permalink
Post by unknown
Thank you for a very interesting and informative analysis. I must
admit that I tend to lump together Erlang/OTP as "Erlang" and see
solutions in that context. In a way, I feel as if using Erlang
"standalone" without OTP is very roughly analogous to using C++
without the STL: it can be done, but why on earth would one want to?
Because the STL
- is still not fully portable between compilers;
in theory it should not be, but it takes you into
deep template territory where compilers have
incompatible bugs (though they are improving)
- is in my experience less efficient than home-brew
code (the STL relies, like Haskell, on *serious*
optimisation which compilers do not always do)
- gets you some of the most incomprehensible
compiler error messages (un)imaginable when you
make mistakes, which you always do, because
- it is unpleasantly complex to use (this will, I
hope, be remedied in C0x, at least if I can
trust Stroustrup's summary of what's going to be
there).

None of these applies to OTP, but the argument that

- it is another large and complex body of material
to master on top of something itself unfamiliar.

does apply to both. I have a colleague who has done
serious commercial software development in C++ and
flatly refuses to have anything to do with the STL
(see reasons above).

Let's take the current example and see if we can squeeze a
bit more out of it. Tim Watson pointed out that there is
an issue about "locking" and failing processes. My take
on this is "let it crash": when processes are coupled like
this they should be as a rule be linked and if one dies,
all should die. The OTP behaviours take care of that kind
of thing, which is why *once you have grasped the basics*
you should try them before rolling your own. What's really
interesting here is that the original system was written in
terms of threads and locking. Have you looked at thread
interfaces lately?

[Single Unix Specification version 3]

pthread_mutex_lock(&mutex)
pthread_mutex_unlock(&mutex)

There are four kinds of mutex in POSIX threads:
PTHREAD_MUTEX_NORMAL
recursive locking deadlocks
unlocking a mutex that is already unlocked
or that is locked by another => undefined
PTHREAD_MUTEX_ERRORCHECK
these conditions return an error code
PTHREAD_MUTEX_RECURSIVE
recursive locking works
unlocking a mutex that is already unlocked
or that is locked by another => error code
PTHREAD_MUTEX_DEFAULT
same as PTHREAD_MUTEX_NORMAL
Anything else
All behaviour undefined

[Solaris 2.10]
If
(1) _POSIX_THREAD_PRIO_INHERIT is defined
(2) the mutex was initialised with protocol
attribute PTHREAD_PRIO_INHERIT
(3) the mutex was initialised with robustness
attribute PTHREAD_MUTEX_ROBUST_NP
(4) the last holder of the lock crashed
then
(A) an attempt to lock the mutex will 'fail'
with the error code EOWNERDEAD
(B) but in fact the attempt will have succeeded
(C) it is up to the *new* owner of the lock to
try to clean up the state
(D) if it can, it calls pthread_mutex_consistent_np
(E) if it crashes, the next locker will get the
same error code and the same chance to recover
(F) if it can't, it should unlock the mutex, and
future lockers will get a *different* error code
(ENOTRECOVERABLE).
(G) it is possible to call pthread_mutex_consistent_np
on mutexes that aren't held or didn't need
recovery and the behaviour is undefined

If a mutex with the default PTHREAD_MUTEX_STALLED_NP
robustness value is held by a thread that dies,
future locks are "blocked in an unspecified manner".
What this means in practice I'm not sure.

If you reckon the Solaris 2.10 treatment of crashed
lock holders is a mess, perhaps you can point me to
something better in SUSv3. I can't find anything in
SUSv3 to say _what_ happens when a lock owner crashes.
(And don't get me onto the subject of cancellation points.)


What kind of building block is _this_ for building
reliable systems?

Message passing plus linking is so much easier to
have justified confidence in that pthreads and TBB start
to look like extremely sick jokes.
Post by unknown
<off-topic>One last thing: I read the Ethics of Belief after poring
over one of your posts recently, and was exceptionally impressed
with the gentleman's writing and philosophy, with which I strongly
agree. In that regard, I'd like to misquote John Stuart Mill,
namely, "A foolish certainty is the hobgoblin of little minds".
(Actually, I think my version is a slight improvement ;) I am
unfortunately seeing the mechanism of insufficiently examined
beliefs at work today, resulting in the persecution of a friend of
mine by way of (the almost totally belief-based) Shaken Baby
Syndrome. So the essay really resonated deeply with me. I daresay
William K. Clifford would have had a lot to say about this. I wish
he were still alive to do so. Bertrand Russel too.</offtopic>
<also-off-topic>Clifford takes a really good idea and pushes it beyond
the bounds
of reason; he manages, presumably without intending to, to make any
belief in
science ethically unjustifiable. Specifically and in Clifford's day
topically,
Clifford's rule about believing other people meant that it would have
been *bad*
for anyone to believe in Natural Selection. I encountered Clifford's
paper in
a book by DeMarco and someone else about risk management in software
engineering.
I find the idea that Shaken Baby is still credited deeply upsetting;
please convey
my sympathy and good wishes to your friend.
</also-off-topic>
unknown
2008-07-02 01:58:26 UTC
Permalink
On Tue, Jul 1, 2008 at 8:20 PM, Richard A. O'Keefe <ok>
Post by unknown
Post by unknown
Thank you for a very interesting and informative analysis. I must admit
that I tend to lump together Erlang/OTP as "Erlang" and see solutions in
that context. In a way, I feel as if using Erlang "standalone" without OTP
is very roughly analogous to using C++ without the STL: it can be done, but
why on earth would one want to?
Because the STL
- is still not fully portable between compilers;
in theory it should not be, but it takes you into
deep template territory where compilers have
incompatible bugs (though they are improving)
- is in my experience less efficient than home-brew
code (the STL relies, like Haskell, on *serious*
optimisation which compilers do not always do)
- gets you some of the most incomprehensible
compiler error messages (un)imaginable when you
make mistakes, which you always do, because
- it is unpleasantly complex to use (this will, I
hope, be remedied in C0x, at least if I can
trust Stroustrup's summary of what's going to be
there).
Maybe after using it for a long time, it's doesn't seem that complicated. At
least, not compared to C++ itself. C++ itself is IMHO horribly complex and
arguably a language to be used only by C++ experts. Just writing
exception-safe code alone can be dreadfully difficult, never mind the
powerful yet labyrinthine maze that is called templates. I've now used the
STL extensively for many years, and it has a large advantage in that it is
more or less supported by all compliant compilers (and "free" as in beer).
It pretty much does the trick. Initially, the ability of various compilers
to handle templates was pretty bad and inconsistent so I rolled my own
portable libraries. I had to write a significant amount of commercial C++
code that had to be ported across OS/390, HP/UX, Windows, Solaris, AIX, and
Linux, so I have a feel for the issues involved. I found it amusing that
Andrei Alexandrescu could write syntactically valid template code that no
C++ compilers could parse at that time. After a while, though, the effort of
maintaining (testing, documenting, fixing) all that code across all those
systems became too much and I used the STL exclusively after that. I had to
choose a lowest common denominator approach and avoid doing anything really
fancy, but it was still worth it.
The STL was efficient enough for most purposes. In the IT department of a
large telco where I worked for some years, trying to improve quality and
establish standards, the code was so inefficient that the benefits of using
the STL far outweighed any efficiency concerns. They didn't even compile the
code with optimization when it went into production, so the overhead of
those templates must have been quite significant, yet it didn't even
register on the radar compared to the big concerns (like, actually getting
the code to work, and getting the developers to understand why it's slow to
do a linear lookup in a vector of 100,000 items, in a loop that iterates
50,000 times). When you're in the trenches, you are grateful for anything
that works as advertised most of the time. Gladly, I am out of the C++
business right now and happily learning the Erlang world, which is a joy in
comparison.
Post by unknown
None of these applies to OTP, but the argument that
- it is another large and complex body of material
to master on top of something itself unfamiliar.
does apply to both.
Agreed.
Post by unknown
I have a colleague who has done
serious commercial software development in C++ and
flatly refuses to have anything to do with the STL
(see reasons above).
I did, too, until a few years ago (see above).
Post by unknown
Let's take the current example and see if we can squeeze a
bit more out of it. Tim Watson pointed out that there is
an issue about "locking" and failing processes. My take
on this is "let it crash": when processes are coupled like
this they should be as a rule be linked and if one dies,
all should die. The OTP behaviours take care of that kind
of thing, which is why *once you have grasped the basics*
you should try them before rolling your own. What's really
interesting here is that the original system was written in
terms of threads and locking. Have you looked at thread
interfaces lately?
Unfortunately, I have worked a significant amount with pthreads, Java
threads, Windows threads, and OS/2 threads, and consider that skill set to
be in the same domain as C++: for experts (as opposed to the "average"
developer found in IT shops and who knows how many software companies) only.
The fact that we are now seeing multicore processors and are being exhorted
to write more thread-bearing code makes me shudder. I suspect that most of
the readers of this newsgroup are highly skilled developers, hopefully
working for great companies, and I wonder if they have been exposed to the
horrors of the "average" developer (as evidenced in "The Daily WTF" and
similar web sites). I hope I am not coming across conceited; I'm not like
that at all. It is just that I have spent 25 years working hard to be as
good a software guy (for want of a better description) as possible, and for
my troubles have almost always been put in a position where I have had to
teach and mentor the "average" developer (and fix literally millions of
lines of code, usualy C++). My experience (and others I am sure will have
differences of opinion) is that the level of the average developer is
trending steeply downwards as developers become a globalized commodity
("resource") and compete more on price than on ability. These days, I find
it rare to see halfway decent code written in ANY language, and the thought
of zillions of lines of thread-bearing C++ and Java software being let loose
on the world by people who can't even spell O(N^2) scares the heck out of
me. Sorry if I am ranting a bit.
Post by unknown
[Single Unix Specification version 3]
pthread_mutex_lock(&mutex)
pthread_mutex_unlock(&mutex)
PTHREAD_MUTEX_NORMAL
recursive locking deadlocks
unlocking a mutex that is already unlocked
or that is locked by another => undefined
PTHREAD_MUTEX_ERRORCHECK
these conditions return an error code
PTHREAD_MUTEX_RECURSIVE
recursive locking works
unlocking a mutex that is already unlocked
or that is locked by another => error code
PTHREAD_MUTEX_DEFAULT
same as PTHREAD_MUTEX_NORMAL
Anything else
All behaviour undefined
[Solaris 2.10]
If
(1) _POSIX_THREAD_PRIO_INHERIT is defined
(2) the mutex was initialised with protocol
attribute PTHREAD_PRIO_INHERIT
(3) the mutex was initialised with robustness
attribute PTHREAD_MUTEX_ROBUST_NP
(4) the last holder of the lock crashed
then
(A) an attempt to lock the mutex will 'fail'
with the error code EOWNERDEAD
(B) but in fact the attempt will have succeeded
(C) it is up to the *new* owner of the lock to
try to clean up the state
(D) if it can, it calls pthread_mutex_consistent_np
(E) if it crashes, the next locker will get the
same error code and the same chance to recover
(F) if it can't, it should unlock the mutex, and
future lockers will get a *different* error code
(ENOTRECOVERABLE).
(G) it is possible to call pthread_mutex_consistent_np
on mutexes that aren't held or didn't need
recovery and the behaviour is undefined
If a mutex with the default PTHREAD_MUTEX_STALLED_NP
robustness value is held by a thread that dies,
future locks are "blocked in an unspecified manner".
What this means in practice I'm not sure.
If you reckon the Solaris 2.10 treatment of crashed
lock holders is a mess, perhaps you can point me to
something better in SUSv3. I can't find anything in
SUSv3 to say _what_ happens when a lock owner crashes.
(And don't get me onto the subject of cancellation points.)
What kind of building block is _this_ for building
reliable systems?
One made of clay and straw and baked in the sun.
Post by unknown
Message passing plus linking is so much easier to
have justified confidence in that pthreads and TBB start
to look like extremely sick jokes.
I couldn't agree with you more. One thing that amazed me about Erlang is
how, once I had figured out how to write something, it Just Worked (TM). I
fell in love.
Post by unknown
<off-topic>One last thing: I read the Ethics of Belief after poring over
Post by unknown
one of your posts recently, and was exceptionally impressed with the
gentleman's writing and philosophy, with which I strongly agree. In that
regard, I'd like to misquote John Stuart Mill, namely, "A foolish certainty
is the hobgoblin of little minds". (Actually, I think my version is a slight
improvement ;) I am unfortunately seeing the mechanism of insufficiently
examined beliefs at work today, resulting in the persecution of a friend of
mine by way of (the almost totally belief-based) Shaken Baby Syndrome. So
the essay really resonated deeply with me. I daresay William K. Clifford
would have had a lot to say about this. I wish he were still alive to do so.
Bertrand Russel too.</offtopic>
<also-off-topic>Clifford takes a really good idea and pushes it beyond the
bounds
of reason; he manages, presumably without intending to, to make any belief
in
science ethically unjustifiable. Specifically and in Clifford's day
topically,
Clifford's rule about believing other people meant that it would have been
*bad*
for anyone to believe in Natural Selection. I encountered Clifford's paper
in
a book by DeMarco and someone else about risk management in software
engineering.
I find the idea that Shaken Baby is still credited deeply upsetting; please
convey
my sympathy and good wishes to your friend.
</also-off-topic>
<rant>
I won't go off-topic after this, but I just wanted to write two things:

(a) Shaken Baby is sadly alive and persecuting parents and child carers in
the USA, Australia and England.
(b) After reading dozens of papers on the SBS myth, I mean syndrome, and
seeing the terrible cost to innocent families due to junk science, I rather
lean toward being too much a "Cliffordite" than too little. This is
particularly true after I read of scientific bias and just plain fraud (all
that lovely research money is so enticing). I actually took Clifford's
advice by following up on as many references as I could find in
peer-reviewed papers, and I was appalled at how, in many cases, results were
used selectively from these papers to back up shaky claims and make them
look solidly research-based. These days I am almost reluctant to accept the
evidence of my own eyes, having learned how deeply subjective "reality" is,
and how what is up today is down tomorrow. Galileo and many others were
persecuted for being "heretical" and later shown to be correct. Great "laws"
of nature (classical relativity -> special relativity -> general relativity
-> quantum mechanics -> string theory -> ??) were challenged and either
overturned or radically transformed by those brave and noble enough to
endure scorn and criticism. Heretics were burned at the stake not that long
ago; are we humans really any different now? Yes, I like Clifford. Rather be
too critical than not critical enough, even if it risks making beliefs in
science ethically unjustifiable: in the days of the Internet and easy claims
of anything, and people who will believe almost any plausible claims, we
need to be more vigilant than ever.</rant>
--
The great enemy of the truth is very often not the lie -- deliberate,
contrived and dishonest, but the myth, persistent, persuasive, and
unrealistic. Belief in myths allows the comfort of opinion without the
discomfort of thought.
John F. Kennedy 35th president of US 1961-1963 (1917 - 1963)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20080701/15f32a0a/attachment.html>
unknown
2008-07-02 04:27:56 UTC
Permalink
First of all: Wow, I never expected this much of a response.

I need some time to sort through the replies and will reply when I
have more time. I just wanted to write this short reply and say thank
you to everyone for being willing to help a newbie into this area.

Mike

On Jul 1, 6:58 pm, "Edwin Fine" <erlang-questions_ef...>
On Tue, Jul 1, 2008 at 8:20 PM, Richard A. O'Keefe <o...>
Post by unknown
Post by unknown
Thank you for a very interesting and informative analysis. I must admit
that I tend to lump together Erlang/OTP as "Erlang" and see solutions in
that context. In a way, I feel as if using Erlang "standalone" without OTP
is very roughly analogous to using C++ without the STL: it can be done, but
why on earth would one want to?
Because the STL
- is still not fully portable between compilers;
in theory it should not be, but it takes you into
deep template territory where compilers have
incompatible bugs (though they are improving)
- is in my experience less efficient than home-brew
code (the STL relies, like Haskell, on *serious*
optimisation which compilers do not always do)
- gets you some of the most incomprehensible
compiler error messages (un)imaginable when you
make mistakes, which you always do, because
- it is unpleasantly complex to use (this will, I
hope, be remedied in C0x, at least if I can
trust Stroustrup's summary of what's going to be
there).
Maybe after using it for a long time, it's doesn't seem that complicated. At
least, not compared to C++ itself. C++ itself is IMHO horribly complex and
arguably a language to be used only by C++ experts. Just writing
exception-safe code alone can be dreadfully difficult, never mind the
powerful yet labyrinthine maze that is called templates. I've now used the
STL extensively for many years, and it has a large advantage in that it is
more or less supported by all compliant compilers (and "free" as in beer).
It pretty much does the trick. Initially, the ability of various compilers
to handle templates was pretty bad and inconsistent so I rolled my own
portable libraries. I had to write a significant amount of commercial C++
code that had to be ported across OS/390, HP/UX, Windows, Solaris, AIX, and
Linux, so I have a feel for the issues involved. I found it amusing that
Andrei Alexandrescu could write syntactically valid template code that no
C++ compilers could parse at that time. After a while, though, the effort of
maintaining (testing, documenting, fixing) all that code across all those
systems became too much and I used the STL exclusively after that. I had to
choose a lowest common denominator approach and avoid doing anything really
fancy, but it was still worth it.
The STL was efficient enough for most purposes. In the IT department of a
large telco where I worked for some years, trying to improve quality and
establish standards, the code was so inefficient that the benefits of using
the STL far outweighed any efficiency concerns. They didn't even compile the
code with optimization when it went into production, so the overhead of
those templates must have been quite significant, yet it didn't even
register on the radar compared to the big concerns (like, actually getting
the code to work, and getting the developers to understand why it's slow to
do a linear lookup in a vector of 100,000 items, in a loop that iterates
50,000 times). When you're in the trenches, you are grateful for anything
that works as advertised most of the time. Gladly, I am out of the C++
business right now and happily learning the Erlang world, which is a joy in
comparison.
Post by unknown
None of these applies to OTP, but the argument that
- it is another large and complex body of material
to master on top of something itself unfamiliar.
does apply to both.
Agreed.
Post by unknown
I have a colleague who has done
serious commercial software development in C++ and
flatly refuses to have anything to do with the STL
(see reasons above).
I did, too, until a few years ago (see above).
Post by unknown
Let's take the current example and see if we can squeeze a
bit more out of it. Tim Watson pointed out that there is
an issue about "locking" and failing processes. My take
on this is "let it crash": when processes are coupled like
this they should be as a rule be linked and if one dies,
all should die. The OTP behaviours take care of that kind
of thing, which is why *once you have grasped the basics*
you should try them before rolling your own. What's really
interesting here is that the original system was written in
terms of threads and locking. Have you looked at thread
interfaces lately?
Unfortunately, I have worked a significant amount with pthreads, Java
threads, Windows threads, and OS/2 threads, and consider that skill set to
be in the same domain as C++: for experts (as opposed to the "average"
developer found in IT shops and who knows how many software companies) only.
The fact that we are now seeing multicore processors and are being exhorted
to write more thread-bearing code makes me shudder. I suspect that most of
the readers of this newsgroup are highly skilled developers, hopefully
working for great companies, and I wonder if they have been exposed to the
horrors of the "average" developer (as evidenced in "The Daily WTF" and
similar web sites). I hope I am not coming across conceited; I'm not like
that at all. It is just that I have spent 25 years working hard to be as
good a software guy (for want of a better description) as possible, and for
my troubles have almost always been put in a position where I have had to
teach and mentor the "average" developer (and fix literally millions of
lines of code, usualy C++). My experience (and others I am sure will have
differences of opinion) is that the level of the average developer is
trending steeply downwards as developers become a globalized commodity
("resource") and compete more on price than on ability. These days, I find
it rare to see halfway decent code written in ANY language, and the thought
of zillions of lines of thread-bearing C++ and Java software being let loose
on the world by people who can't even spell O(N^2) scares the heck out of
me. Sorry if I am ranting a bit.
Post by unknown
[Single Unix Specification version 3]
pthread_mutex_lock(&mutex)
pthread_mutex_unlock(&mutex)
PTHREAD_MUTEX_NORMAL
recursive locking deadlocks
unlocking a mutex that is already unlocked
or that is locked by another => undefined
PTHREAD_MUTEX_ERRORCHECK
these conditions return an error code
PTHREAD_MUTEX_RECURSIVE
recursive locking works
unlocking a mutex that is already unlocked
or that is locked by another => error code
PTHREAD_MUTEX_DEFAULT
same as PTHREAD_MUTEX_NORMAL
Anything else
All behaviour undefined
[Solaris 2.10]
If
(1) _POSIX_THREAD_PRIO_INHERIT is defined
(2) the mutex was initialised with protocol
attribute PTHREAD_PRIO_INHERIT
(3) the mutex was initialised with robustness
attribute PTHREAD_MUTEX_ROBUST_NP
(4) the last holder of the lock crashed
then
(A) an attempt to lock the mutex will 'fail'
with the error code EOWNERDEAD
(B) but in fact the attempt will have succeeded
(C) it is up to the *new* owner of the lock to
try to clean up the state
(D) if it can, it calls pthread_mutex_consistent_np
(E) if it crashes, the next locker will get the
same error code and the same chance to recover
(F) if it can't, it should unlock the mutex, and
future lockers will get a *different* error code
(ENOTRECOVERABLE).
(G) it is possible to call pthread_mutex_consistent_np
on mutexes that aren't held or didn't need
recovery and the behaviour is undefined
If a mutex with the default PTHREAD_MUTEX_STALLED_NP
robustness value is held by a thread that dies,
future locks are "blocked in an unspecified manner".
What this means in practice I'm not sure.
If you reckon the Solaris 2.10 treatment of crashed
lock holders is a mess, perhaps you can point me to
something better in SUSv3. I can't find anything in
SUSv3 to say _what_ happens when a lock owner crashes.
(And don't get me onto the subject of cancellation points.)
What kind of building block is _this_ for building
reliable systems?
One made of clay and straw and baked in the sun.
Post by unknown
Message passing plus linking is so much easier to
have justified confidence in that pthreads and TBB start
to look like extremely sick jokes.
I couldn't agree with you more. One thing that amazed me about Erlang is
how, once I had figured out how to write something, it Just Worked (TM). I
fell in love.
Post by unknown
<off-topic>One last thing: I read the Ethics of Belief after poring over
Post by unknown
one of your posts recently, and was exceptionally impressed with the
gentleman's writing and philosophy, with which I strongly agree. In that
regard, I'd like to misquote John Stuart Mill, namely, "A foolish certainty
is the hobgoblin of little minds". (Actually, I think my version is a slight
improvement ;) I am unfortunately seeing the mechanism of insufficiently
examined beliefs at work today, resulting in the persecution of a friend of
mine by way of (the almost totally belief-based) Shaken Baby Syndrome. So
the essay really resonated deeply with me. I daresay William K. Clifford
would have had a lot to say about this. I wish he were still alive to do so.
Bertrand Russel too.</offtopic>
<also-off-topic>Clifford takes a really good idea and pushes it beyond the
bounds
of reason; he manages, presumably without intending to, to make any belief
in
science ethically unjustifiable. Specifically and in Clifford's day
topically,
Clifford's rule about believing other people meant that it would have been
*bad*
for anyone to believe in Natural Selection. I encountered Clifford's paper
in
a book by DeMarco and someone else about risk management in software
engineering.
I find the idea that Shaken Baby is still credited deeply upsetting; please
convey
my sympathy and good wishes to your friend.
...
read more ?
_______________________________________________
erlang-questions mailing list
erlang-questi...://www.erlang.org/mailman/listinfo/erlang-questions
unknown
2008-06-29 11:16:34 UTC
Permalink
Hi Mike,

To go off on a slight tangent with the hope of understanding your problem
better, do you have to get the data from the other collections with them
locked? Can you just ask for their current state and use whatever they have
at that point or are they somehow dependent on each other?

Tom
Post by unknown
Hello,
I was not sure where best to post this and hope someone is able to
help with this question.
After being bitten by the pitfalls of lock-oriented multi threading I
am interested in switching to message passing oriented concurrency.
I have read that erlang has a per-process ordering guarantee (that is
if A sends messages 1 and 2 to B 1 will arrive before 2 at B. However,
there is no guarantee that messages from C and D will not be placed in
between 1 and 2.
In my current lock oriented program design I have threads dedicated to
managing different collections and actions on those collections
(hashmaps), lets call them A, B, C, and D. At several points in the
program one thread needs to get data from the other collections and
make a decision based on the collective state of the values it gathers
form those other collections.
An example situation would be A needing data from B, C, and D. In that
case A would attempt to lock A, B, C, and D then gather the items it
needs and release the lock.
How would I accomplish this same task in a message passing manner?
I had though of doing: A sends a message to B, C, and D asking for the
data. However, B, C, and D may have each changed independently of each
other by the time they receive the request for data and/or by the time
they are able to send the message to A. So, how do you deal with
situations like this in a message passing paradigm?
_______________________________________________
erlang-questions mailing list
erlang-questions
http://www.erlang.org/mailman/listinfo/erlang-questions
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20080629/93db61d1/attachment.html>
unknown
2008-06-30 08:49:32 UTC
Permalink
Change your design from pull to push. Make servers which collect data and do
something instead "request" data or actions.
Post by unknown
Hello,
I was not sure where best to post this and hope someone is able to
help with this question.
After being bitten by the pitfalls of lock-oriented multi threading I
am interested in switching to message passing oriented concurrency.
I have read that erlang has a per-process ordering guarantee (that is
if A sends messages 1 and 2 to B 1 will arrive before 2 at B. However,
there is no guarantee that messages from C and D will not be placed in
between 1 and 2.
In my current lock oriented program design I have threads dedicated to
managing different collections and actions on those collections
(hashmaps), lets call them A, B, C, and D. At several points in the
program one thread needs to get data from the other collections and
make a decision based on the collective state of the values it gathers
form those other collections.
An example situation would be A needing data from B, C, and D. In that
case A would attempt to lock A, B, C, and D then gather the items it
needs and release the lock.
How would I accomplish this same task in a message passing manner?
I had though of doing: A sends a message to B, C, and D asking for the
data. However, B, C, and D may have each changed independently of each
other by the time they receive the request for data and/or by the time
they are able to send the message to A. So, how do you deal with
situations like this in a message passing paradigm?
_______________________________________________
erlang-questions mailing list
erlang-questions
http://www.erlang.org/mailman/listinfo/erlang-questions
--
--Hynek (Pichi) Vychodil
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20080630/90eaef54/attachment.html>
unknown
2008-07-01 03:37:18 UTC
Permalink
This post might be inappropriate. Click to display it.
unknown
2008-06-30 13:14:23 UTC
Permalink
Hi Mike,

some of the more experienced erlangers probably have specific
techniques to help with this, but my general take on it (coming from a
mixed background that includes java/c++ as well as erlang) is to
reconsider this design. I'd move away from the pull style you're using
here, which requires all this synchronization code (although it's much
easier to understand than comparable thread safe java code), and go
for a push style that (a) doesn't involve all this mutable state and
(b) puts the sync code all in one place.
Post by unknown
After being bitten by the pitfalls of lock-oriented multi threading I
am interested in switching to message passing oriented concurrency.
Indeed, but Erlang's message passing style isn't the only thing about
it that makes concurrent programming easier! The fact that Erlang
follows the functional programming paradigm is also a major
improvement over imperative languages like Java. The key thing here is
that you don't have mutable state to worry about. I'd question why you
need to implement these state containers in the first place - perhaps
there's a more purely functional way of looking at the problem at hand
and a finer abstraction to help deal with it!? I'm sure that posting
some more details might entice people to have a more concrete and
interesting discussion!

Now I'll concede that some Erlang process do encapsulate state that
changes over time (in response to incoming messages) and using your
example of pessimistic concurrency, it's true that in order to
maintain integrity across the 3 "container processes" you'd need some
kind of locking protocol. This can be established very easily by using
a selective receive on B, C and D, to ensure that a request for a
resource with id=X within the process, is locked until the requesting
process (A) releases it. You might, for example, set up a lock list in
your loop, for example:

loop(State, LockList) ->
%% ... some interesting code...
receive
{ SenderPid, acquire, ItemId } ->
{ acquired, Response } = get_item(ItemId, LockList),
SenderPid ! { { acquired, ok }, Response },
loop(State, [ ItemId | LockList ])
; { SenderPid, read, ItemId } ->
SenderPid ! { { read, ok }, get_item_readonly(ItemId) },
loop(State, LockList)
; { SenderPid, write, { ItemId, Value } } ->
{ acquired, Response } = get_item(ItemId, LockList),
Write = write_item(ItemId, Value),
SenderPid ! { { write, ok }, Write }
end.

Ok so it's a very quick and contrived example, but you get the point.
The read should have better concurrent behavior as we only need to
check locks for writes or write locks (acquire) and in this simple
example, I used pattern matches to make sure that failed acquire
and/or write message will cause the process to die. I'm not sure
that's how I'd like to do it in a production system, but it serves as
a very simple example. It might be that you're better off with proper
transaction semantics (such as 2 phase commit), which are quite easy
to write in the message passing style using patterns such as
leader-follower to deal with leader election and transaction voting,
etc.There's lots of examples of this on the web.

Cheers,

Tim Watson
unknown
2008-06-30 13:16:36 UTC
Permalink
Whoops - think I posted this without a RE: in the subject line - sorry! :(

---------- Forwarded message ----------
From: Tim Watson <watson.timothy>
Date: 2008/6/30
Subject: [erlang-questions] Question about message passing paradigm
To: erlang-questions

Hi Mike,

some of the more experienced erlangers probably have specific
techniques to help with this, but my general take on it (coming from a
mixed background that includes java/c++ as well as erlang) is to
reconsider this design. I'd move away from the pull style you're using
here, which requires all this synchronization code (although it's much
easier to understand than comparable thread safe java code), and go
for a push style that (a) doesn't involve all this mutable state and
(b) puts the sync code all in one place.
Post by unknown
After being bitten by the pitfalls of lock-oriented multi threading I
am interested in switching to message passing oriented concurrency.
Indeed, but Erlang's message passing style isn't the only thing about
it that makes concurrent programming easier! The fact that Erlang
follows the functional programming paradigm is also a major
improvement over imperative languages like Java. The key thing here is
that you don't have mutable state to worry about. I'd question why you
need to implement these state containers in the first place - perhaps
there's a more purely functional way of looking at the problem at hand
and a finer abstraction to help deal with it!? I'm sure that posting
some more details might entice people to have a more concrete and
interesting discussion!

Now I'll concede that some Erlang process do encapsulate state that
changes over time (in response to incoming messages) and using your
example of pessimistic concurrency, it's true that in order to
maintain integrity across the 3 "container processes" you'd need some
kind of locking protocol. This can be established very easily by using
a selective receive on B, C and D, to ensure that a request for a
resource with id=X within the process, is locked until the requesting
process (A) releases it. You might, for example, set up a lock list in
your loop, for example:

loop(State, LockList) ->
%% ... some interesting code...
receive
{ SenderPid, acquire, ItemId } ->
{ acquired, Response } = get_item(ItemId, LockList),
SenderPid ! { { acquired, ok }, Response },
loop(State, [ ItemId | LockList ])
; { SenderPid, read, ItemId } ->
SenderPid ! { { read, ok }, get_item_readonly(ItemId) },
loop(State, LockList)
; { SenderPid, write, { ItemId, Value } } ->
{ acquired, Response } = get_item(ItemId, LockList),
Write = write_item(ItemId, Value),
SenderPid ! { { write, ok }, Write }
end.

Ok so it's a very quick and contrived example, but you get the point.
The read should have better concurrent behavior as we only need to
check locks for writes or write locks (acquire) and in this simple
example, I used pattern matches to make sure that failed acquire
and/or write message will cause the process to die. I'm not sure
that's how I'd like to do it in a production system, but it serves as
a very simple example. It might be that you're better off with proper
transaction semantics (such as 2 phase commit), which are quite easy
to write in the message passing style using patterns such as
leader-follower to deal with leader election and transaction voting,
etc.There's lots of examples of this on the web.

Cheers,

Tim Watson
unknown
2008-07-01 07:06:43 UTC
Permalink
Hi Edwin,

I'm quite fresh around here too and I agree, someone definitely needs
to write up some common patterns for the paradigm. I also suggested a
locking solution using selective receive, but being a little tired, I
managed to post it with the wrong subject line. Twice. <sigh>

Still, it wasn't really a solution, but a discussion about the
applicability of selective receive - I can't see another way of doing
this. My concern with the approach is that there is no way for a
process to "safely lock itself" unless it can trap exists from the
process trying to "acquire the lock," as it were. If A starts
communicating with B and asks B to stay in a particular state until it
[A] is ready, then A had better broadcast it's death if anything goes
wrong so that B can "unlock itself," for want of a better description.
The more processes are involved, the more interesting that gets. My
original solution was to lock each container process against a
specific value (key) and continue to serve reads while blocking
writes. My solution doesn't however, deal with process failures very
well! Here's a quick stab:

loop(State, LockList) ->
%% ... some interesting code...
receive
{ SenderPid, acquire, ItemId } ->
case get_item(ItemId, LockList) of
{ acquired, Response } -> SenderPid ! { { acquired, ok
}, Response }
; _ -> SenderPid ! error
end,
loop(State, [ ItemId | LockList ])
; { SenderPid, read, ItemId } ->
SenderPid ! { { read, ok }, get_item_readonly(ItemId) },
loop(State, LockList)
; { SenderPid, write, { ItemId, Value } } ->
{ acquired, Response } = get_item(ItemId, LockList),
Write = write_item(ItemId, Value),
SenderPid ! { { write, ok }, Write }
end.

Obviously I've imagined some code in get_item that checks the lock
list before a successful acquire, etc. This solution is non blocking
for reads, but in the event that you try and get a write (i.e,
acquire) a locked item, the process dies with a pattern matching
failure (based on the assumption that get_item will return an 'error'
atom or something like that, on failure). This is hardly a solution,
but I wanted to look at the selective receive bit. What happens if the
process who "acquired" item X dies and doesn't notify you!? X sort of
"goes missing,", which is hardly ideal. In your example, where the
individual processes actually block until released, a fatal error in
the lock owner will deadlock the system.

You could always check the sender pid against your linked nodes and
link to it if it's not already in there. Personally, I don't like this
approach though - it all feels like a kludge. I like the idea of
changing from a pull to a push design (mentioned earlier), as this
whole locking multiple processes feels horrible. Think about a typical
(Joe forgive me) OO implementation in java or some such. The number 1
rule is don't call into some other object while holding a lock, as
you've no idea what might happen and you might end up deadlocked. This
question raises the same questions for me - do I really want these
distributed locks cropping up all over the place? If data consistency
is really that important, then maybe we should find a better way to
accumulate the data we need to perform a given operation.

As another contrived example, I can't imagine that if in the process
of say, selling broadband, you need to fist check the network
inventory, then provision an engineer and finally set up a billing
record against a customer account, that you'd want to deal with the
locking protocol in this way. Say the network inventory and engineer
availability rosters are in different parts of the system, which you
need to call via some RPC (say they're not actual Erlang nodes, but
some java application that you hit over HTTP), or maybe just in a
database or whatever. I would've thought that the safest way to do
this with pessimistic locking is surely to lock the whole provisioning
activity (e.g., upon receiving the initial call) - this might entail
exposing the provisioning process as a gen_server and performing the
locking there (this might even happen implicitly, if you use the
synchronous 'call' function instead of the async 'cast') or something
like that. This is going to have a bad effect on your liveliness,
naturally, but that's life. A course grained lock is safer, but less
lively. A lock that doesn't deal with error conditions sounds bad
though.

Hopefully someone with a bit more experience will chime in here and
point out the obvious to us new folk!? ......

Cheers,

Tim Watson
------------------------------
Message: 7
Date: Mon, 30 Jun 2008 23:42:51 -0400
From: "Edwin Fine" <erlang-questions_efine>
Subject: Re: [erlang-questions] Question about message passing
paradigm
To: "Richard A. O'Keefe" <ok>
Cc: Mike T <talmage.news>, erlang-questions
<6c2563b20806302042q2605badie220bd1acb7d72b3>
Content-Type: text/plain; charset="utf-8"
Richard,
I'm new to Erlang and FP, and I want to learn from my mistakes or
misunderstandings. Please could you critique the suggestion I sent in
regarding this problem? The fact that nobody commented means either that it
was (a) totally naive and not worthy of comment (or to spare my feelings),
or (b) such a good idea that all were rendered speechless with admiration.
Somehow the probability of (b) seems rather low. So... where did I go wrong?
Regards,
Edwin FIne
On Mon, Jun 30, 2008 at 11:33 PM, Richard A. O'Keefe <ok>
Post by unknown
The problem we are discussing is
processes B, C, D hold information X, Y, Z respectively;
process A wants a coherent snapshot of X, Y,Z.
There are actually two slightly different cases depending on
A needs "X Y Z as of *now*" (A, B, C, and D must all
synchronise), or A needs "X Y Z as of *some* time in the
recent past" (B, C, and D must all synchronise but can then
send the information to A without waiting for A to receive it).
I like this problem because it is simple yet subtle.
One way that it is subtle is that in "multithread" programming
most people STILL think in terms of a single absolute time
shared by all threads. (After all, there _is_ such a thing,
the system clock. And yes, it's not exactly true, but it's
close enough to make people _think_ it's true.) But when you
start thinking about Erlang and especially *distributed*
Erlang, you start realising that "now" is a pretty fuzzy
concept.
To me, the easiest approach to *think* was
A sends "tell me X" to B, "tell me Y" to C,
and "tell me Z" to D in any order,
then does three receives in any order to
collect the results, then
sends "thanks" to B, C, and D in any order.
B receives "tell me X" from A, computes X,
sends "here is X" to A, then waits for
"thanks" from A.
C and D are just like B.
This has 9 messages compared, but it is symmetric, and it is
actually pretty close to the locking technique (to "lock" a
process, send it a message asking it to wait, receive its
acceptance, and then send it a message telling it to proceed).
This extends to any number of processes E, F, G, ... and none
of the processes except A has to know about any of the others.
On 30 Jun 2008, at 6:45 pm, Erik Reitsma proposed a scheme
that doesn't actually need any functions to be sent around
A sends "I need X Y and Z, ask C and D for Y and Z"
to B then waits.
B sends "A needs Y and Z, ask D for Z, here is X"
to C. B then waits for the go-ahead.
C sends "A needs Z, here are X and Y" to D,
then waits.
D sends "here are X Y and Z" to A.
If a four-way synchronisation is wanted, D waits.
If a three-way is wanted, it doesn't.
A receives X, Y, and Z. It sends "go ahead" to B
and C (and to D if four-way is wanted).
This is 7 messages if the "NOW" reading with four-way
synchronisation is wanted, or 6 if the "some time in the
recent past" reading with three-way synchronisation is
wanted. Since B, C, and D have to be told that their
information is needed and A has to receive the results,
four messages are needed.
Can it be done in fewer than 7 (or 6) messages?
Are there other readings of the requirements that I've missed?
How do/should we think about these problems?
Pace Dijkstra, I'm afraid that I came up with the schemes
who knows what?
who needs to know what?
can the message count be reduced if I ask
someone to do something for me?
plus a sort of intuitive idea that
to get N processes to synchronise, get all
but 1 of them waiting for something
There has to be a tutorial, perhaps even a textbook, waiting
to be written about "how to think about messages".
--
The great enemy of the truth is very often not the lie -- deliberate,
contrived and dishonest, but the myth, persistent, persuasive, and
unrealistic. Belief in myths allows the comfort of opinion without the
discomfort of thought.
John F. Kennedy 35th president of US 1961-1963 (1917 - 1963)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.erlang.org/pipermail/erlang-questions/attachments/20080630/5455b955/attachment.html
------------------------------
_______________________________________________
erlang-questions mailing list
erlang-questions
http://www.erlang.org/mailman/listinfo/erlang-questions
End of erlang-questions Digest, Vol 14, Issue 1
***********************************************
unknown
2008-07-01 09:17:52 UTC
Permalink
Hi Edwin,

I hadn't thought of using timeouts like this, but yes, it sounds like
a viable idea. I guess my concern(s) about liveliness still stand, but
Richards use of a last minute collection strategy seems to address
that quite well. I'd try to avoid this kind of locking protocol if at
all possible though, refactoring towards a push based design (as per
Richard and Hynek Vychodil's discussion elsewhere on this thread), but
where this is just the art of the possible, it seems good to me.

Cheers,

Tim
Well, Richard's response to my question (and the OP's question) was most
enlightening.
I believe that deadlocks can be dealt with quite easily by the judicious
choice of timeouts. For example, in the suggestions I posted involving the
use of a gen_fsm behavior, it would be relatively trivial to add a timeout
that reverts the state back to "unlocked" should the set of operations not
be complete within a reasonable period of time. Indeed, I think Joe had an
example of that in the book, or else the gen_fsm documentation does, showing
a gen_fsm used to hold the combination to a safe. After a period of time, if
digits are entered the safe is not unlocked, the state reverts to "locked"
again and it has to start from scratch.
Richard's idea of sending individual queries which don't allow the receiver
to handle any other queries until it gets an ACK could also be protected by
timeouts.
In this way, I don't think the broadcasting of deaths would be necessary: if
a process is not "unlocked" in a reasonable amount of time, it unlocks
itself. This covers a multitude of sins: process death, network death,
programmer death (well, maybe not :).
As Richard aptly pointed out, I tend to see solutions in a framework of
Erlang/OTP, not pure Erlang the language. So if using a gen_fsm behavior,
erlang:send_after() is perfect for this timeout strategy.
try
A = thing_a:get_value(),
B = thing_b:get_value(),
C = thing_c:get_value(),
{ok, {A, B, C}}
catch
Cls:Exc -> {error, {Cls, Exc}}
after
thing_c:release(),
thing_b:release(),
thing_a:release()
end.
Even if the node running the above code crashes, if timeouts exist in
thing_a, thing_b, and thing_c, they will unlock themselves regardless of if
release() is called. Of course, you could get a spurious old release()
coming in after a timeout occurred, messing up the next transaction, but you
could deal with that by using a common ref to correlate the three. If a
stale release came in with the wrong ref, it would be ignored.
I dunno. Too simplistic? Thoughts?
Edwin
Post by unknown
Hi Edwin,
I'm quite fresh around here too and I agree, someone definitely needs
to write up some common patterns for the paradigm. I also suggested a
locking solution using selective receive, but being a little tired, I
managed to post it with the wrong subject line. Twice. <sigh>
Still, it wasn't really a solution, but a discussion about the
applicability of selective receive - I can't see another way of doing
this. My concern with the approach is that there is no way for a
process to "safely lock itself" unless it can trap exists from the
process trying to "acquire the lock," as it were. If A starts
communicating with B and asks B to stay in a particular state until it
[A] is ready, then A had better broadcast it's death if anything goes
wrong so that B can "unlock itself," for want of a better description.
The more processes are involved, the more interesting that gets. My
original solution was to lock each container process against a
specific value (key) and continue to serve reads while blocking
writes. My solution doesn't however, deal with process failures very
loop(State, LockList) ->
%% ... some interesting code...
receive
{ SenderPid, acquire, ItemId } ->
case get_item(ItemId, LockList) of
{ acquired, Response } -> SenderPid ! { { acquired, ok
}, Response }
; _ -> SenderPid ! error
end,
loop(State, [ ItemId | LockList ])
; { SenderPid, read, ItemId } ->
SenderPid ! { { read, ok }, get_item_readonly(ItemId) },
loop(State, LockList)
; { SenderPid, write, { ItemId, Value } } ->
{ acquired, Response } = get_item(ItemId, LockList),
Write = write_item(ItemId, Value),
SenderPid ! { { write, ok }, Write }
end.
Obviously I've imagined some code in get_item that checks the lock
list before a successful acquire, etc. This solution is non blocking
for reads, but in the event that you try and get a write (i.e,
acquire) a locked item, the process dies with a pattern matching
failure (based on the assumption that get_item will return an 'error'
atom or something like that, on failure). This is hardly a solution,
but I wanted to look at the selective receive bit. What happens if the
process who "acquired" item X dies and doesn't notify you!? X sort of
"goes missing,", which is hardly ideal. In your example, where the
individual processes actually block until released, a fatal error in
the lock owner will deadlock the system.
You could always check the sender pid against your linked nodes and
link to it if it's not already in there. Personally, I don't like this
approach though - it all feels like a kludge. I like the idea of
changing from a pull to a push design (mentioned earlier), as this
whole locking multiple processes feels horrible. Think about a typical
(Joe forgive me) OO implementation in java or some such. The number 1
rule is don't call into some other object while holding a lock, as
you've no idea what might happen and you might end up deadlocked. This
question raises the same questions for me - do I really want these
distributed locks cropping up all over the place? If data consistency
is really that important, then maybe we should find a better way to
accumulate the data we need to perform a given operation.
As another contrived example, I can't imagine that if in the process
of say, selling broadband, you need to fist check the network
inventory, then provision an engineer and finally set up a billing
record against a customer account, that you'd want to deal with the
locking protocol in this way. Say the network inventory and engineer
availability rosters are in different parts of the system, which you
need to call via some RPC (say they're not actual Erlang nodes, but
some java application that you hit over HTTP), or maybe just in a
database or whatever. I would've thought that the safest way to do
this with pessimistic locking is surely to lock the whole provisioning
activity (e.g., upon receiving the initial call) - this might entail
exposing the provisioning process as a gen_server and performing the
locking there (this might even happen implicitly, if you use the
synchronous 'call' function instead of the async 'cast') or something
like that. This is going to have a bad effect on your liveliness,
naturally, but that's life. A course grained lock is safer, but less
lively. A lock that doesn't deal with error conditions sounds bad
though.
Hopefully someone with a bit more experience will chime in here and
point out the obvious to us new folk!? ......
Cheers,
Tim Watson
------------------------------
Message: 7
Date: Mon, 30 Jun 2008 23:42:51 -0400
From: "Edwin Fine" <erlang-questions_efine>
Subject: Re: [erlang-questions] Question about message passing
paradigm
To: "Richard A. O'Keefe" <ok>
Cc: Mike T <talmage.news>, erlang-questions
<6c2563b20806302042q2605badie220bd1acb7d72b3>
Content-Type: text/plain; charset="utf-8"
Richard,
I'm new to Erlang and FP, and I want to learn from my mistakes or
misunderstandings. Please could you critique the suggestion I sent in
regarding this problem? The fact that nobody commented means either that it
was (a) totally naive and not worthy of comment (or to spare my feelings),
or (b) such a good idea that all were rendered speechless with admiration.
Somehow the probability of (b) seems rather low. So... where did I go wrong?
Regards,
Edwin FIne
On Mon, Jun 30, 2008 at 11:33 PM, Richard A. O'Keefe <ok>
Post by unknown
The problem we are discussing is
processes B, C, D hold information X, Y, Z respectively;
process A wants a coherent snapshot of X, Y,Z.
There are actually two slightly different cases depending on
A needs "X Y Z as of *now*" (A, B, C, and D must all
synchronise), or A needs "X Y Z as of *some* time in the
recent past" (B, C, and D must all synchronise but can then
send the information to A without waiting for A to receive it).
I like this problem because it is simple yet subtle.
One way that it is subtle is that in "multithread" programming
most people STILL think in terms of a single absolute time
shared by all threads. (After all, there _is_ such a thing,
the system clock. And yes, it's not exactly true, but it's
close enough to make people _think_ it's true.) But when you
start thinking about Erlang and especially *distributed*
Erlang, you start realising that "now" is a pretty fuzzy
concept.
To me, the easiest approach to *think* was
A sends "tell me X" to B, "tell me Y" to C,
and "tell me Z" to D in any order,
then does three receives in any order to
collect the results, then
sends "thanks" to B, C, and D in any order.
B receives "tell me X" from A, computes X,
sends "here is X" to A, then waits for
"thanks" from A.
C and D are just like B.
This has 9 messages compared, but it is symmetric, and it is
actually pretty close to the locking technique (to "lock" a
process, send it a message asking it to wait, receive its
acceptance, and then send it a message telling it to proceed).
This extends to any number of processes E, F, G, ... and none
of the processes except A has to know about any of the others.
On 30 Jun 2008, at 6:45 pm, Erik Reitsma proposed a scheme
that doesn't actually need any functions to be sent around
A sends "I need X Y and Z, ask C and D for Y and Z"
to B then waits.
B sends "A needs Y and Z, ask D for Z, here is X"
to C. B then waits for the go-ahead.
C sends "A needs Z, here are X and Y" to D,
then waits.
D sends "here are X Y and Z" to A.
If a four-way synchronisation is wanted, D waits.
If a three-way is wanted, it doesn't.
A receives X, Y, and Z. It sends "go ahead" to B
and C (and to D if four-way is wanted).
This is 7 messages if the "NOW" reading with four-way
synchronisation is wanted, or 6 if the "some time in the
recent past" reading with three-way synchronisation is
wanted. Since B, C, and D have to be told that their
information is needed and A has to receive the results,
four messages are needed.
Can it be done in fewer than 7 (or 6) messages?
Are there other readings of the requirements that I've missed?
How do/should we think about these problems?
Pace Dijkstra, I'm afraid that I came up with the schemes
who knows what?
who needs to know what?
can the message count be reduced if I ask
someone to do something for me?
plus a sort of intuitive idea that
to get N processes to synchronise, get all
but 1 of them waiting for something
There has to be a tutorial, perhaps even a textbook, waiting
to be written about "how to think about messages".
--
The great enemy of the truth is very often not the lie -- deliberate,
contrived and dishonest, but the myth, persistent, persuasive, and
unrealistic. Belief in myths allows the comfort of opinion without the
discomfort of thought.
John F. Kennedy 35th president of US 1961-1963 (1917 - 1963)
-------------- next part --------------
An HTML attachment was scrubbed...
http://www.erlang.org/pipermail/erlang-questions/attachments/20080630/5455b955/attachment.html
------------------------------
_______________________________________________
erlang-questions mailing list
erlang-questions
http://www.erlang.org/mailman/listinfo/erlang-questions
End of erlang-questions Digest, Vol 14, Issue 1
***********************************************
--
The great enemy of the truth is very often not the lie -- deliberate,
contrived and dishonest, but the myth, persistent, persuasive, and
unrealistic. Belief in myths allows the comfort of opinion without the
discomfort of thought.
John F. Kennedy 35th president of US 1961-1963 (1917 - 1963)
unknown
2008-07-01 15:03:54 UTC
Permalink
Responding to a thread on the erlang-questions list...

Begin quote=================================
The problem we are discussing is
processes B, C, D hold information X, Y, Z respectively;
process A wants a coherent snapshot of X, Y,Z.

There are actually two slightly different cases depending on
A needs "X Y Z as of *now*" (A, B, C, and D must all
synchronise), or A needs "X Y Z as of *some* time in the
recent past" (B, C, and D must all synchronise but can then
send the information to A without waiting for A to receive it).

I like this problem because it is simple yet subtle.
One way that it is subtle is that in "multithread" programming
most people STILL think in terms of a single absolute time
shared by all threads. (After all, there _is_ such a thing,
the system clock. And yes, it's not exactly true, but it's
close enough to make people _think_ it's true.) But when you
start thinking about Erlang and especially *distributed*
Erlang, you start realising that "now" is a pretty fuzzy
concept.
End quote===================================

Yes, the problem seems simple yet subtle. The downside is
there are many unwritten constraints (or not) on any specific
problem that could lead the solution alternatives one way or
another. Unless you want to really dig into those, then the
cost/benefit of one solution or another could be more or less
off.

e.g. why not coordinate through an in-memory database?
This could be reasonable, or not. We don't know enough.

Why not schedule the source processes to send a message
on a periodic or scheduled basis? This could be reasonable,
or not, and cut down the message traffic, which seemed to be
a concern.

Why is sending fewer than N messages a concern? Why
does one process have to collect the information? How
much information? How tight is the deadline? Is "now" an
actual timestamp or just some unknown point in time that
a request has been received? How close
to "now" do the other "nows" have to be with respect to each
other? Can you widen that window if it would decrease the
effort to build?

If synchronization across the processes is needed then
is an "eventually consistent" approach reasonable if it
lowers the effort to build?

Interesting stuff, but challenging to talk about in when the
details are too abstract.
unknown
2008-07-05 15:59:49 UTC
Permalink
Post by unknown
There are actually two slightly different cases depending on
A needs "X Y Z as of *now*" (A, B, C, and D must all
synchronise), or A needs "X Y Z as of *some* time in the
recent past" (B, C, and D must all synchronise but can then
send the information to A without waiting for A to receive it).
As you discovered, the only way to feel like you can solve this
as described is to cause all of the processes involved to block
until all processing is completed.

This, of course, leads to all sorts of problems if any of the
processes runs into trouble partway through.

In a more truly concurrent system, there is really no useful
meaning for "now" in a global sense. When a principal sends
you a message, all you really get to know is that the message
was sent before you received it. You can build up more useful
details than that, but you can't ever get all the way to "now"
without removing concurrency -- e.g. by making everyone block.

-Justin

Continue reading on narkive:
Loading...