2014-11-18Published on 2014-11-18
Godson-T murphi-modeling (1)
Godson-T is a many-core architecture designed and implemented by Institite of Computing Technology, Chinese Academy of Sciences. And Godson-T cache coherence protocol (Godson-T protocol for short) is the cache coherence protocol used in Godson-T system.
Model in murphi
- There exist 2 caches in each node, and 2 memory address. In fact, this is not a big problem because murphi is a parameterized modeling language.
- The caches will take initiative in replacement while need, e.i., we should define related rules. In fact, this brings easiness for modeling. However, the replacement procedure is still the hardest problem in modeling.
- Locks could be nested, e.i., many locks will be acquired at the same time, while one node can only hold one lock.
Approach to Replacement
Modeling replacement procedure itself is not a hard work, but for further use in our research tools and experiments, indexes or ranks or parameters of arrays are better to be integer constants or simple integer variables rather than members of complex data structures. And this task is not that easy.
So what is the replacement procedure? It is the situation that we need designate a cache in the node. If there exist some INVALID caches, we randomly choose one (in practise, we should obey some principles such as LRU); else if there does not exist any INVALID chache, we still randomly choose one, but if it is DIRTY, then we Replace it into the associated memory address and then mark it as INVALID. We implement this procedure with a series of rules, and before these rules, we need store the current state and after these rules, we need return to the stored state. Thus, a rule that needs replacement will be seperated into 2 rules.
I designed two different methods for replacement procedure in modeling Godson-T.
Replacement Procedure #1
The problem in this procedure is that, in LRU, the nodes will choose proper caches without concerning about if its state is VALID or DIRTY, but this procedure concerns.
Replacement Procedure #2
This procedure is better because it's nearer to the protocol and it has fewer states. The states in it are described following in detail. They are stages of that caches take initiative in replacement.
- NON: do not need Replacement
- REQUIRE: require to replace
- RANDOM: if there does not exist a INVALID cache, then choose a random one
- RANDINV: if there exists at least a INVALID cache, then choose a random INVALID one
- DESIGNATED: the cache to be replaced has been designated
- TOREP: after DESIGNATED, if the designated cache is DIRTY, then replace it
- DONE: the replacement has been done
Support Vector Machines
Support Vector Machines (SVMs) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis.
Maximal Margin Hyperplane
A Maximal Margin Hyperplane is the hyperplane that has the largest distance to the nearest training data point of any class. In general, the larger the margin the lower the generalization error of the classifier. Intuitively, if the margin is not large enough, a small disturbance of the separation border will bring significant influence.
Abour detail of why we should choose a maximal margin hyperplane, see Structural Risk Minimization (SRM).
Given some training data D, a set of n points of the form
where the yi is either 1 or -1, indicating the class to which the point xi belongs. Any hyperplane can be written as the set of points x satisfying
w.x - b = 0
If the training data are linearly separable, we can select two hyperplane in a way that they separate the data and there is no point between them, and then try to maximize their distance. The region bounded by them is called the margin. These hyperplanes can be described by the two following equations.
w.x - b = 1
w.x - b = -1
By using geomety, we find the distance or margin between these two hyperplanes is 2/||w||. we want to maximize the margin, so we need to minimize ||w||. As we also have to prevent data points from falling into the margin, we add some constaints: for each i, either of the two following inequation holds.
w.xi - b >= 1 for xi of the class labeled by 1
w.xi - b <= -1 for xi of the class labeled by -1
This can be rewritten as:
yi(w.xi - b) >= 1
Now the problem becomes:
Minimize ||w|| in w and b, subject to yi(w.xi - b) >= 1 for any i.
We can solve this problem with Lagrange Multiplier and Quadratic Programming techniques. And the Karush–Kuhn–Tucker condition implies that the solution can be expressed as a linear combination of the training vectors:
Only a few αi will be greater than zero. The corresponding xi are exactly the support vectors, which lie on the margin and satisfy yi(w.xi - b) = 1. And besides, all of the αi could be computed by Quadratic Programming. Then we can compute b, which is the population mean of w.xi - yi.
Deal with Linear Inseparable
If there exists no hyperplane that can split the two class, the Soft Margin method wiil choose a hyperplane tha splits the examples as cleanly as possible, while still maximizing the distance to the nearest cleanly split examples.
The method introduces non-negative slack variables ξi which measure the degree of misclassification of the data xi:
And then the problem becomes:
Maximize (in αi)
subject to 0 <= αi <= C and Σni=1αiyi = 0 for any i. C is a user given constant for measure Soft of the margin.
To deal with this situation, we need to replace evey dot product by a nonlinear kernel function. For detail read Wikipedia;