## 2014-11-14

## BMC

*Bounded Model Checking* (BMC) was first proposed by A. Biere et al. in 1999 and was elaborated further by A. Biere et al. in 2003. Unlike traditional model checking techniques, BMC uses a propositional SAT (*Boolean Satisfiability Problem* or *Propositional Satisfiability Problem*) solver rather than BDD (*Binary Decision Diagram*) manipulation techniques.

BMC can find many logical errors in complex systems that cannot be handled by competing techniques, and is therefore widely perceived as a **complementary** technique to BDD-based model checking.

### Basic Idea

The basic idea in BMC is to search for a counterexample in executions whose length is bounded by some integer *k*. If no bug is found, one increases *k* until either a bug is found, the problem becomes intractable, or some known upper bound is reached (this bound is called the *Completeness Threshold* of the design). The BMC problem can be efficiently reduced to a propositional satisfiability problem, and can therefore be solved by SAT methods rather than BDDs. SAT procedures do not suffer from the space explosion problem of BDD-based methods. Modern SAT solvers can handle propositional satisfiability problems with hundreds of thousands of variables or more.
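The bounded search can be sketched as follows. This is a toy illustration rather than a real BMC encoding: a genuine BMC tool would unroll the transition relation into a SAT formula, whereas this sketch enumerates paths explicitly. The counter system, its transition relation, and the "bad" state are all invented for the example.

```python
# Hypothetical toy system: a counter modulo 8; the "bug" is reaching state 5.
# BMC searches for a counterexample path of length <= k, increasing k until a
# bug is found or a known completeness threshold is reached.

def successors(state):
    """Transition relation of the toy system: increment modulo 8."""
    return [(state + 1) % 8]

def is_bad(state):
    """The property violation we search for: the counter reaches 5."""
    return state == 5

def bmc(initial, bound):
    """Search for a counterexample of length <= bound; return the path or None."""
    frontier = [[initial]]
    for _ in range(bound):
        next_frontier = []
        for path in frontier:
            for nxt in successors(path[-1]):
                new_path = path + [nxt]
                if is_bad(nxt):
                    return new_path          # counterexample found
                next_frontier.append(new_path)
        frontier = next_frontier
    return None                              # no bug within this bound

# Increase k until a counterexample appears (or a known threshold is hit).
for k in range(1, 9):
    cex = bmc(0, k)
    if cex is not None:
        print(f"counterexample at bound {k}: {cex}")
        break
```

The outer loop is the essential shape of BMC: the bound grows one step at a time, and each bound induces a fresh (in reality, SAT-encoded) search problem.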

Although BMC aims at solving the same problems as traditional BDD-based symbolic model checking, it has two unique characteristics.

- First, the user has to provide a bound on the number of cycles that should be explored, which implies that the method is **incomplete** if the bound is not high enough.
- Second, it uses SAT techniques rather than BDDs.

Besides, experiments have shown that there is little correlation between which problems are hard for SAT and which are hard for BDD-based techniques. Therefore, classes of problems that are known to be hard for BDDs can often be solved with SAT, and vice versa.

### PLTL

This section gives some background on the temporal logic used in model checking.

*Propositional Linear Temporal Logic* (PLTL, or LTL for short) is an extension of propositional logic. From propositional logic LTL inherits Boolean variables and Boolean operators such as negation ¬, conjunction ∧, implication → etc. In addition to the Boolean connectives, LTL has temporal operators.

- The *next time* operator **X**

  The formula **X** *p* specifies that property *p* holds at the next time step.

- The *Globally* operator **G**

  A formula **G** *p* holds along a path if *p* holds in all states of the path. This operator is used for safety properties.

- The *Finally* operator **F**

  The formula **F** *p* holds along a path if *p* holds somewhere on the path. This operator is used for liveness properties.

- The *Until* operator **U**

  The formula *f* **U** *g* holds along a path iff *f* has to hold at least until *g*, which holds at the current or a future state.

- The *Release* operator **R**

  The formula *f* **R** *g* holds along a path iff *g* has to be true until and including the point where *f* first becomes true; if *f* never becomes true, *g* must remain true forever.
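The operator semantics above can be sketched as small predicate combinators over a finite path (a list of states). Note that LTL is properly defined over infinite paths, so this bounded finite-path reading is only illustrative; all names here are made up for the sketch.

```python
# Each operator takes predicates of the form p(path, i) -> bool and returns
# a new predicate of the same form, mirroring the informal semantics above.

def X(p):
    # X p: p holds at the next time step
    return lambda path, i: i + 1 < len(path) and p(path, i + 1)

def G(p):
    # G p: p holds in all remaining states of the path
    return lambda path, i: all(p(path, j) for j in range(i, len(path)))

def F(p):
    # F p: p holds somewhere on the remaining path
    return lambda path, i: any(p(path, j) for j in range(i, len(path)))

def U(f, g):
    # f U g: f holds at least until g holds at the current or a future state
    return lambda path, i: any(
        g(path, j) and all(f(path, m) for m in range(i, j))
        for j in range(i, len(path))
    )

def R(f, g):
    # f R g: g holds until and including the point where f first becomes true;
    # if f never becomes true, g must hold to the end of this (finite) path
    return lambda path, i: all(
        g(path, j) or any(f(path, m) for m in range(i, j))
        for j in range(i, len(path))
    )

p = lambda path, i: path[i] == 'p'
trace = ['q', 'q', 'p', 'p']
print(F(p)(trace, 0))  # True: p eventually holds on the trace
print(G(p)(trace, 2))  # True: p holds in every state from position 2 on
```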

### Formal Formulas in BMC

This is outside my motivation for learning the topic, so this section is omitted.

## *k*-NN

The *k-Nearest Neighbors* algorithm (*k*-NN) is a non-parametric method used for classification and regression. In both cases, the input consists of the *k* closest training examples in the feature space. The output depends on whether *k*-NN is used for classification or regression.

*k*-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. The *k*-NN algorithm is among the simplest of all machine learning algorithms.

### Algorithm

- Training Phase

  The training phase of the *k*-NN algorithm consists only of storing the feature vectors and class labels of the training samples, which are vectors in a multidimensional feature space, each with a class label.

- Classification Phase

  In the classification phase, *k* is a user-defined constant, and an unlabeled vector is classified by assigning the label that is most frequent among the *k* training samples nearest to that query point. In *k*-NN regression, the output is the average of the values of its *k* nearest neighbors.
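The two phases can be sketched in a few lines; this is a naive brute-force version (all names are our own, not from a library), with Euclidean distance, majority vote for classification, and the mean for regression:

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train, query, k):
    """train: list of (feature_vector, label); returns the majority label
    among the k training samples nearest to the query point."""
    neighbors = sorted(train, key=lambda ex: euclidean(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def knn_regress(train, query, k):
    """train: list of (feature_vector, value); returns the average of the
    values of the k nearest neighbors."""
    neighbors = sorted(train, key=lambda ex: euclidean(ex[0], query))[:k]
    return sum(v for _, v in neighbors) / k

train = [([0.0, 0.0], 'a'), ([0.1, 0.1], 'a'),
         ([1.0, 1.0], 'b'), ([0.9, 1.1], 'b')]
print(knn_classify(train, [0.2, 0.2], 3))  # 'a'
```

Note that "training" really is just storing `train`; all work happens at query time, which is what makes *k*-NN a lazy learner.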

A drawback of the basic *majority voting* classification occurs when the class distribution is skewed, i.e., examples of a more frequent class tend to dominate the prediction of the new example, because they tend to be common among the *k* nearest neighbors due to their large number. There are two ways to overcome this problem.

- One way to overcome this problem is to **weight the classification**, taking into account the distance from the test point to each of its *k* nearest neighbors.
- Another way to overcome skew is by **abstraction** in data representation. For example, in a *self-organizing map* (SOM), each node is a representative of a cluster of similar points, regardless of their density in the original training data.
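A minimal sketch of the distance-weighted variant: each neighbor's vote is weighted by the inverse distance 1/d, so closer neighbors count more and the skewed majority class dominates less. The `eps` guard against division by zero is our own choice, not part of the textbook definition.

```python
import math
from collections import defaultdict

def weighted_knn_classify(train, query, k, eps=1e-9):
    """train: list of (feature_vector, label); votes are weighted by 1/d."""
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    neighbors = sorted(train, key=lambda ex: dist(ex[0], query))[:k]
    scores = defaultdict(float)
    for vec, label in neighbors:
        scores[label] += 1.0 / (dist(vec, query) + eps)
    return max(scores, key=scores.get)

# One close 'a' outvotes two farther 'b's, even though 'b' is more frequent:
train = [([0.0], 'a'), ([1.0], 'b'), ([1.1], 'b'), ([1.2], 'b')]
print(weighted_knn_classify(train, [0.1], 3))  # 'a' (plain majority vote gives 'b')
```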

### Properties

- Feature extraction is important, and inappropriate features will lead to wrong classification. For example, when classifying people with height measured in metres (ranging from 1.5 m to 1.85 m) and weight measured in pounds (ranging from 90 to 250), the classifier will be dominated by weight, so the features must be rescaled.
- *k*-NN has some strong consistency results.
- *k*-NN is sensitive to noise.
- The naive version of the algorithm is easy to implement by computing the distances from the test example to all stored examples, but it is computationally intensive for large training sets. Using an appropriate nearest neighbor search algorithm makes *k*-NN computationally tractable even for large data sets. Many nearest neighbor search algorithms have been proposed over the years; these generally seek to reduce the number of distance evaluations actually performed.
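For the height/weight example, one simple fix is to rescale each feature to [0, 1] before computing distances. A minimal min-max normalization sketch (the function name and sample values are our own illustration):

```python
def min_max_scale(rows):
    """rows: list of feature vectors; returns copies rescaled to [0, 1]
    per column, so no single feature dominates the Euclidean distance."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0
         for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

# Heights in metres, weights in pounds (ranges from the example above).
people = [[1.5, 90.0], [1.85, 250.0], [1.7, 160.0]]
print(min_max_scale(people))
```

After scaling, a 0.1 m height difference and a 45-pound weight difference contribute on comparable scales to the distance, instead of weight swamping height.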