2014-11-20

Published on 2014-11-20


Godson-T Murphi Modeling (3)

Problem: Pseudo-Protection

There are still counter examples even if we use one lock:

A node i required a lock to enter its critical region, and before it left, another node j wrote the memory in i's critical region. Then caches in i were not coherent.

The reason of this incoherence is that in our model, the locks do not protect the critical regions in reality.

Since the pseudo-protection of locks lead to the incoherence, we must re-write our murphi code:

  1. Add an array attribute inProtection of boolean to each lock, so that we can record what actually the critical region protected by this lock is.
  2. Before each no-lock-write, we check if the address is in protection of a lock.
  3. Before each locked write of a node i, we assert that if the address is in protection of lock l, then l must be the lock that i holds.
  4. After each operation in critical region, we add the protection record in the new attribute inProtection.
  5. When a node releases it lock, we remove all protection by the lock.

Problem: Confusing Replacement

Even if we protect the critical regions in reality, there are still some errors. Suppose there is only one cache in each node (so that fasten the generation of corresponding counterexamples):

There existed some nodes such as j that hold a dirty copy of address a. Then a node i required a lock, and read a as a first read, then j wrote memory in another address, which forced the dirty cache of address a to be written back. Then, cache in i was inherent.

This error caused by the Confusing Replacement strategy we use in our model. In fact, there are two diffirent situation which caused a "replacement".

  • If a locked first read occurs, we need write back ALL dirty caches in ALL nodes first, then act the read.
  • In other situations, only when a dirty cache need to be reused, we write back it no matter what state the other caches is in.

In our current model, we confuse the former situation with the latter, which brings an error. Now we need modify our replacement procedure.

Replacement Procedure

As the figure illustrated, we add two states in our replacement procedure: REQREPALL and REPALLDONE. Now the states or stages in replacement procedure means:

  • NON: do not need Replacement
  • REQUIRE: require to replace
  • REQREPALL: in Locked First Read, need replace all dirty caches in all nodes
  • RANDOM: if there does not exist a INVALID cache, then choose a random one
  • RANDINV: if there exists at least a INVALID cache, then choose a random INVALID one
  • DESIGNATED: the cache to be replaced has been designated
  • TOREP: after DESIGNATED, if the designated cache is DIRTY, then replace it
  • DONE: the replacement has been done
  • REPALLDONE: the REQREPALL has been done

That is, if we meet a locked 1st read, we first replace all dirty caches in all nodes, then resume to original steps. In this approach, not only can we use nested locks in our model, but also the model can hold the strong Coherence Property without weakening it:

The cached data must be the same with data in their associated memories. In Godson-T, this means Read operation in critical region must cache correct data.

Now our modeling work has been done well!

About GANF

Guard-Assignment Normal Form (GANF) is the principle that we use while modeling in murphi. It requires that

  • Never use if clause in the assignment part of a rule.
  • The indexes or ranks or parameters of array variables must be simple and constant.
    • Simple: the indexes mustn't be nested variables, e.i., are not reference form complex data structures.
    • Constant: the indexes must be either parameter of rulesets or constants, e.i., it's not a common variable.
  • No procedures and functions.
  • The rule names are better to be identifiers.

Association Analysis

Definition

  • Items: Let I = {i1, ..., in} be a set of n binary attributes. Items represent goods in a bascket bought by a custom.
  • Item set: The set of 1-valued items in I.
  • Transactions: Let D = {t1, ..., tm} be a set of transactions which represent the bascket of a custom. A set of Transactions is also called a Database. Each transaction in D has a unique transaction ID and contains a itemset.
  • Support of an Itemset: The support σ(X) of an itemset X is defined as the proprtion of transactions in D which contain X. A transaction t support X if X is subset of itemset in t. And σ(X) is the proportion of transactions which suppurt X.
  • Frequent Itemset: A itemset whose support is greater than a minimum support as threshold.
  • Rule: A rule is defined as an implication of the form XY where X, Y are itemsets and XY = Ø.
  • Support of a Rule: The support of a rule is defined as s(XY) = σ(X U Y)/N, where N is count of items I.
  • Confidence: The confidence of a rule is defined as c(XY) = σ(X U Y)/σ(X).

Process

Association rules are usually required to satisfy a user-specified minimum support and a user-specified minimum confidence at the same time. Association rule generation is usually split up into two separate steps:

  1. First, minimum support is applied to find all frequent itemsets in a database, e.i., find all frequent itemsets.
  2. Second, these frequent itemsets and the minimum confidence constraint are used to form rules called Strong Rules.

While the second step is straightforward, the first step needs more attention. For more see Apriori Algorithm.