<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>
<channel>
	<title>Comments on: The Achilles heel of the CAP theorem</title>
	<atom:link href="http://blog.atomikos.com/2008/09/the-achilles-heel-of-the-cap-theorem/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.atomikos.com/2008/09/the-achilles-heel-of-the-cap-theorem/</link>
	<description>The Atomikos Blog</description>
	<pubDate>Wed, 22 May 2013 21:59:11 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: barry</title>
		<link>http://blog.atomikos.com/2008/09/the-achilles-heel-of-the-cap-theorem/comment-page-1/#comment-27</link>
		<dc:creator>barry</dc:creator>
		<pubDate>Sat, 28 Mar 2009 21:45:54 +0000</pubDate>
		<guid isPermaLink="false">http://atomikos.com/blog/?p=32#comment-27</guid>
		<description>» Only process requests when there is no partition problem.

Doesn't this mean that you are sacrificing availability? You've turned a failure in partitioning into a failure in availability. While the requests and responses are queued so no request or response is lost, that doesn't mean all is well. A response may take a long time to come back, which is as much of a problem as getting an error.</description>
		<content:encoded><![CDATA[<p>» Only process requests when there is no partition problem.</p>
<p>Doesn&#8217;t this mean that you are sacrificing availability? You&#8217;ve turned a failure in partitioning into a failure in availability. While the requests and responses are queued so no request or response is lost, that doesn&#8217;t mean all is well. A response may take a long time to come back, which is as much of a problem as getting an error.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Guy</title>
		<link>http://blog.atomikos.com/2008/09/the-achilles-heel-of-the-cap-theorem/comment-page-1/#comment-12</link>
		<dc:creator>Guy</dc:creator>
		<pubDate>Tue, 20 Jan 2009 20:30:00 +0000</pubDate>
		<guid isPermaLink="false">http://atomikos.com/blog/?p=32#comment-12</guid>
		<description>Hi Dan,&lt;br/&gt;&lt;br/&gt;Sure, I have read "Impossibility of distributed consensus with one faulty process" - it is the basis of the heuristic exceptions in all two-phase commit solutions (including Atomikos).&lt;br/&gt;&lt;br/&gt;However, what I am saying is that the failure usually only lasts for so long, and afterward things can move on. Exploiting the right tools to do that can help availability.&lt;br/&gt;&lt;br/&gt;That is the main advantage of (persistent) queues and that is all I am saying. Lynch et al. do not seem to exploit it as much as they could...&lt;br/&gt;&lt;br/&gt;Guy</description>
		<content:encoded><![CDATA[<p>Hi Dan,</p>
<p>Sure, I have read &#8220;Impossibility of distributed consensus with one faulty process&#8221; - it is the basis of the heuristic exceptions in all two-phase commit solutions (including Atomikos).</p>
<p>However, what I am saying is that the failure usually only lasts for so long, and afterward things can move on. Exploiting the right tools to do that can help availability.</p>
<p>That is the main advantage of (persistent) queues and that is all I am saying. Lynch et al. do not seem to exploit it as much as they could&#8230;</p>
<p>Guy</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: PetrolHead</title>
		<link>http://blog.atomikos.com/2008/09/the-achilles-heel-of-the-cap-theorem/comment-page-1/#comment-11</link>
		<dc:creator>PetrolHead</dc:creator>
		<pubDate>Tue, 20 Jan 2009 20:11:00 +0000</pubDate>
		<guid isPermaLink="false">http://atomikos.com/blog/?p=32#comment-11</guid>
		<description>Have you read:&lt;br/&gt;&lt;br/&gt;http://portal.acm.org/citation.cfm?id=214121&lt;br/&gt;&lt;br/&gt;"Impossibility of distributed consensus with one faulty process"?&lt;br/&gt;&lt;br/&gt;This is an important result and is relevant to your comments and the CAP theorem. Essentially one can't tell the difference between a genuine failure and a slow-running machine or busy network.&lt;br/&gt;&lt;br/&gt;Thus your solution might work for a very small number of machines all in a single data-centre, but for larger installations, failures of machines, routers, switches, cables etc. will happen several times a day, and thus quorums and clusters become considerably less practical and loose consistency more attractive.&lt;br/&gt;&lt;br/&gt;Note also that the theorem isn't just about clustered services in the traditional sense but also services that run across multiple data-centres.&lt;br/&gt;&lt;br/&gt;I also have a specific observation:&lt;br/&gt;&lt;br/&gt;"....note that quorum solutions exist to avoid that the complete cluster has to be up at the same time."&lt;br/&gt;&lt;br/&gt;This is true, but in practice they are limited by a number of factors:&lt;br/&gt;&lt;br/&gt;(1)  The assumption that you will have a majority - seemingly this is straightforward but a partition plus the loss of a machine can leave you without a majority.&lt;br/&gt;&lt;br/&gt;(2)  Getting all members back into sync, which can require all sorts of special admin involvement and can go wrong.&lt;br/&gt;&lt;br/&gt;(3)  Performance - quorum protocols, especially across enough nodes to ensure survival, can be slow.&lt;br/&gt;&lt;br/&gt;(4)  Ensuring that clients don't continue to make use of the minority during a partition, e.g. reporting out-of-date information.&lt;br/&gt;&lt;br/&gt;(5)  You can have a cluster capable of achieving consensus but you can't reach it because the network is broken between cluster and clients.&lt;br/&gt;&lt;br/&gt;Best,&lt;br/&gt;&lt;br/&gt;Dan.&lt;br/&gt;http://www.dancres.org/blitzblog</description>
		<content:encoded><![CDATA[<p>Have you read:</p>
<p><a href="http://portal.acm.org/citation.cfm?id=214121" rel="nofollow">http://portal.acm.org/citation.cfm?id=214121</a></p>
<p>&#8220;Impossibility of distributed consensus with one faulty process&#8221;?</p>
<p>This is an important result and is relevant to your comments and the CAP theorem. Essentially one can&#8217;t tell the difference between a genuine failure and a slow-running machine or busy network.</p>
<p>Thus your solution might work for a very small number of machines all in a single data-centre, but for larger installations, failures of machines, routers, switches, cables etc. will happen several times a day, and thus quorums and clusters become considerably less practical and loose consistency more attractive.</p>
<p>Note also that the theorem isn&#8217;t just about clustered services in the traditional sense but also services that run across multiple data-centres.</p>
<p>I also have a specific observation:</p>
<p>&#8220;&#8230;.note that quorum solutions exist to avoid that the complete cluster has to be up at the same time.&#8221;</p>
<p>This is true, but in practice they are limited by a number of factors:</p>
<p>(1)  The assumption that you will have a majority - seemingly this is straightforward but a partition plus the loss of a machine can leave you without a majority.</p>
<p>(2)  Getting all members back into sync, which can require all sorts of special admin involvement and can go wrong.</p>
<p>(3)  Performance - quorum protocols, especially across enough nodes to ensure survival, can be slow.</p>
<p>(4)  Ensuring that clients don&#8217;t continue to make use of the minority during a partition, e.g. reporting out-of-date information.</p>
<p>(5)  You can have a cluster capable of achieving consensus but you can&#8217;t reach it because the network is broken between cluster and clients.</p>
<p>Best,</p>
<p>Dan.<br /><a href="http://www.dancres.org/blitzblog" rel="nofollow">http://www.dancres.org/blitzblog</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
