
Univ Angers, LERIA, SFR MATHSTIC, F-49000 Angers, France
firstname.lastname@univ-angers.fr

GA and ILS for optimizing the size of NFA models

Frédéric Lardeux, Eric Monfroy
Abstract

Grammatical inference consists in learning a formal grammar (as a set of rewrite rules or a finite state machine). We are concerned with learning Nondeterministic Finite Automata (NFA) of a given size from samples of positive and negative words. NFA can naturally be modeled in SAT. The standard model [1] being enormous, we also try a model based on prefixes [2] which generates smaller instances. We also propose a new model based on suffixes and a hybrid model based on prefixes and suffixes. We then focus on optimizing the size of generated SAT instances issued from the hybrid models. We present two techniques to optimize this combination, one based on Iterated Local Search (ILS), the second one based on Genetic Algorithm (GA). Optimizing the combination significantly reduces the SAT instances and their solving time, but at the cost of longer generation time. We, therefore, study the balance between generation time and solving time thanks to some experimental comparisons, and we analyze our various model improvements.

keywords:
Constraint problem modeling, Grammar inference, SAT, model reformulation, NFA inference.

1 Introduction

Grammatical inference [3] (or grammar induction) is concerned with the study of algorithms for learning automata and grammars from some observations. The goal is thus to construct a representation that accounts for the characteristics of the observed objects. This research area plays a significant role in numerous applications, such as compiler design, bioinformatics, speech recognition, pattern recognition, machine learning, and others.

In this article, we focus on learning a finite automaton from samples of words $S=S^{+}\cup S^{-}$, such that $S^{+}$ is a set of positive words that must be accepted by the automaton, and $S^{-}$ is a set of negative words to be rejected by the automaton. Due to their determinism, deterministic finite automata (DFA) are generally faster than nondeterministic finite automata (NFA). However, NFA can be significantly smaller than DFA in terms of the number of states. Moreover, the space complexity of the SAT models representing the problem is generally due to the number of states. Thus, we focus here on NFA inference. An NFA is represented by a 5-tuple $(Q,\Sigma,\Delta,q_{1},F)$ where $Q$ is a finite set of states, the vocabulary $\Sigma$ is a finite set of symbols, the transition function $\Delta:Q\times\Sigma\rightarrow\mathcal{P}(Q)$ associates a set of states to a given state and a given symbol, $q_{1}\in Q$ is the initial state, and $F\subseteq Q$ is the set of final states.

The problem of inferring NFA has been undertaken with various approaches (see, e.g., [1]). Among them, we can cite ad-hoc algorithms such as DeLeTe2 [4] that is based on state merging methods, or the technique of [5] that returns a collection of NFA. Some approaches use metaheuristics for computing NFA, such as hill-climbing [6] or genetic algorithm [7].

A convenient and declarative way of representing combinatorial problems is to model them as a Constraint Satisfaction Problem (CSP [8]) (see, e.g., [1] for an INLP model for inferring NFA, or [9] for a SAT (the propositional satisfiability problem [10]) model of the same problem). Parallel solvers have also been used for minimizing the inferred NFA size [11, 2].

Orthogonally to the approaches cited above, we do not seek to improve a solver, but to generate a model of the problem that is easier to solve with a standard SAT solver. Our approach is similar to DFA inference with graph coloring [12], or NFA inference with complex data structures [9]. Modeling thus consists in translating a problem into a CSP made of decision variables and constraints over these variables. As a reference for comparisons, we start with the basic SAT model of [9]. The model, together with a sample of positive and negative words, leads to a SAT instance to be solved by a classic SAT solver that we use as a black box. However, SAT instances are gigantic: e.g., the space complexity of our base model is in the order of $\mathcal{O}(k^{|\omega_{+}|})$ variables and $\mathcal{O}(|\omega_{+}|.k^{|\omega_{+}|})$ clauses, where $k$ is the number of states of the NFA, and $\omega_{+}$ is the longest positive word of the sample. The second model, $PM$, is based on intermediate variables for each prefix [2], which makes it possible to compute only once the parts of paths that are shared by several words. We propose a third model, $SM$, based on intermediate variables for suffixes. Although the two models could seem similar, their orders of size are totally different: $PM$ is in $\mathcal{O}(k^{2})$ while $SM$ is in $\mathcal{O}(k^{3})$. We then propose hybrid models that split each word into a prefix and a suffix. The prefix is modeled with $PM$ while the suffix is modeled with $SM$. The challenge is then to determine where to split words to optimize the size of the generated SAT instances. To this end, we propose two approaches, one based on iterated local search (ILS), the second one on a genetic algorithm (GA). Both generate smaller SAT instances, much smaller than with the $DM$ model and even the $PM$ model.
However, with GA, the generation time is too long and cancels out the gain in solving time with the Glucose SAT solver [13]. In contrast, the hybrid instances optimized with ILS are smaller, and their generation time plus solving time is shorter than with $PM$. Compared to [9], which is the closest work on NFA inference, we always obtain significantly smaller instances and shorter solving times.

This paper is organized as follows. In Section 2 we present the direct model and the prefix model, and we propose the suffix model. We then combine the suffix and prefix models to propose the new hybrid models (Section 3). Hybrid models are optimized with iterated local search (Sub-section 3.2) and with a genetic algorithm (Sub-section 3.3). We then compare our models experimentally in Section 4 before concluding in Section 5.

2 SAT Models

Given an alphabet $\Sigma=\{s_{1},\ldots,s_{n}\}$ of $n$ symbols, a training sample $S=S^{+}\cup S^{-}$, where $S^{+}$ (respectively $S^{-}$) is a set of positive words (respectively negative words) from $\Sigma^{*}$, and an integer $k$, the NFA inference problem consists in building an NFA with $k$ states which accepts the words of $S^{+}$ and rejects the words of $S^{-}$. Note that the satisfaction problem we consider in this paper can be extended to an optimization problem minimizing $k$ [2].
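To make the acceptance condition concrete, here is a minimal Python sketch that checks whether a candidate NFA is consistent with a sample. The dictionary representation of the transition function and the names `accepts` and `is_consistent` are assumptions made for this illustration; they are not part of the SAT models presented below.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed representation: delta[(state, symbol)] is the set of successor states.
Transitions = Dict[Tuple[int, str], Set[int]]


def accepts(delta: Transitions, finals: Set[int], word: str, initial: int = 1) -> bool:
    """Simulate the NFA on `word` by tracking the set of reachable states."""
    current = {initial}
    for symbol in word:
        current = {q2 for q in current for q2 in delta.get((q, symbol), set())}
        if not current:
            return False  # no path can read the rest of the word
    return bool(current & finals)


def is_consistent(delta: Transitions, finals: Set[int],
                  positives: Iterable[str], negatives: Iterable[str]) -> bool:
    """True if every positive word is accepted and every negative word rejected."""
    return (all(accepts(delta, finals, w) for w in positives)
            and not any(accepts(delta, finals, w) for w in negatives))


# Hypothetical 2-state NFA over {a, b}; state 1 is initial, state 2 is final.
delta = {(1, "a"): {1, 2}, (2, "b"): {2}}
print(is_consistent(delta, {2}, ["a", "ab", "aab"], ["", "b", "ba"]))
```

Any solution of the SAT models of this section, once decoded into transitions and final states, should pass such a check.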

Let us introduce some notations. Let $A=(Q,\Sigma,q_{1},F)$ be an NFA with: $Q=\{q_{1},\ldots,q_{k}\}$ a set of $k$ states, $\Sigma$ a finite alphabet, $q_{1}$ the initial state, and $F$ the set of final states. The empty word is denoted by $\lambda$. We denote by $K$ the set of integers $\{1,\ldots,k\}$.

We consider the following variables:

  • $k$ the size of the NFA we want to learn,

  • a set of $k$ Boolean variables $F=\{f_{1},\ldots,f_{k}\}$ determining whether states $q_{1}$ to $q_{k}$ are final or not,

  • and $\Delta=\{\delta_{s,\vv{q_{i}q_{j}}}~|~s\in\Sigma\textrm{ and }i,j\in K\}$ a set of $n.k^{2}$ Boolean variables defining the existence or not of the transition from state $q_{i}$ to state $q_{j}$ with the symbol $s$, for each $q_{i}$, $q_{j}$, and $s$.

The path $i_{1},i_{2},\ldots,i_{n+1}$ for $w=w_{1}\ldots w_{n}$ exists if and only if $d=\delta_{w_{1},\vv{q_{i_{1}}q_{i_{2}}}}\wedge\ldots\wedge\delta_{w_{n},\vv{q_{i_{n}}q_{i_{n+1}}}}$ is true. We say that the conjunction $d$ is a c_path, and $D_{w,\vv{q_{i}q_{j}}}$ is the set of all c_paths for the word $w$ between states $q_{i}$ and $q_{j}$.
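As an illustration of why the set of c_paths grows quickly, the following sketch enumerates all c_paths of a word between two states. The function name `c_paths` and the encoding of a transition variable as a `(symbol, i, j)` triple are hypothetical choices made for this example only.

```python
from itertools import product
from typing import Iterator, List, Tuple

TransitionVar = Tuple[str, int, int]  # stands for delta_{symbol, q_i -> q_j}


def c_paths(word: str, k: int, start: int, end: int) -> Iterator[List[TransitionVar]]:
    """Yield every c_path for `word` from q_start to q_end in a k-state NFA."""
    n = len(word)
    if n == 0:
        return  # the empty word has no transition, hence no c_path
    # Each choice of the n-1 intermediate states gives one conjunction.
    for middle in product(range(1, k + 1), repeat=n - 1):
        states = (start, *middle, end)
        yield [(word[t], states[t], states[t + 1]) for t in range(n)]


paths = list(c_paths("abb", 3, 1, 2))
print(len(paths))
```

A word of length $n$ has $k^{n-1}$ c_paths towards each end state, hence $k^{n}$ over all end states, which is the source of the exponential terms in the size of the direct model.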

2.1 Direct Model

This simple model has been presented in [9]. It is based on 3 sets of equations:

  1. If the empty word is in $S^{+}$ or $S^{-}$, we can fix whether the first state is final or not:

    $$\textrm{if }\lambda\in S^{+},\quad f_{1} \tag{1}$$
    $$\textrm{if }\lambda\in S^{-},\quad \neg f_{1} \tag{2}$$
  2. For each word $w\in S^{+}$, there is at least one path from $q_{1}$ to a final state $q_{j}$:

    $$\bigvee_{j\in K}\bigvee_{d\in D_{w,\vv{q_{1}q_{j}}}}\big(d\wedge f_{j}\big) \tag{3}$$

    With the Tseitin transformations [14], we create one auxiliary variable for each combination of a word $w$, a state $j\in K$, and a c_path $d\in D_{w,\vv{q_{1}q_{j}}}$: $aux_{w,j,d}\leftrightarrow d\wedge f_{j}$. Hence, we obtain a formula in CNF for each $w$:

    $$\bigwedge_{j\in K}\bigwedge_{d\in D_{w,\vv{q_{1}q_{j}}}}\left[\neg aux_{w,j,d}\vee(d\wedge f_{j})\right] \tag{4}$$
    $$\bigwedge_{j\in K}\bigwedge_{d\in D_{w,\vv{q_{1}q_{j}}}}(aux_{w,j,d}\vee\neg d\vee\neg f_{j}) \tag{5}$$
    $$\bigvee_{j\in K}\bigvee_{d\in D_{w,\vv{q_{1}q_{j}}}}aux_{w,j,d} \tag{6}$$
  3. For each $w\in S^{-}$ and each $q_{j}$, either there is no path from state $q_{1}$ to $q_{j}$, or $q_{j}$ is not final:

    $$\neg\left[\bigvee_{j\in K}\bigvee_{d\in D_{w,\vv{q_{1}q_{j}}}}\big(d\wedge f_{j}\big)\right] \tag{7}$$

Thus, the direct constraint model $DM_{k}$ for building an NFA of size $k$ is:

$$DM_{k}=\bigwedge_{w\in S^{+}}\Big((4)\wedge(5)\wedge(6)\Big)\wedge\bigwedge_{w\in S^{-}}(7)$$

and is possibly completed by (1) or (2) if $\lambda\in S^{+}$ or $\lambda\in S^{-}$.
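The Tseitin step used for Constraints (4) and (5) can be sketched generically as follows. This is only an illustration under assumptions: literals are DIMACS-style signed integers, and `tseitin_and` and `satisfied` are hypothetical helper names, not functions of any solver library.

```python
from itertools import product
from typing import Dict, List


def tseitin_and(aux: int, literals: List[int]) -> List[List[int]]:
    """CNF clauses encoding aux <-> (l_1 and ... and l_n)."""
    clauses = [[-aux, lit] for lit in literals]          # aux implies each literal
    clauses.append([aux] + [-lit for lit in literals])   # all literals imply aux
    return clauses


def satisfied(clauses: List[List[int]], assignment: Dict[int, bool]) -> bool:
    """Evaluate a CNF under a total assignment of the variables."""
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)


# Variables 1 and 2 play the role of the transitions of a c_path d,
# variable 3 plays the role of f_j, and variable 4 is the auxiliary variable.
clauses = tseitin_and(4, [1, 2, 3])
models = [bits for bits in product([False, True], repeat=4)
          if satisfied(clauses, dict(zip([1, 2, 3, 4], bits)))]
print(len(clauses), len(models))
```

For a c_path of $n$ transitions plus $f_{j}$, this yields $n+1$ binary clauses and one clause of arity $n+2$, in line with the arities of Constraints (4) and (5).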

Size of the models (see [9] for details)

Consider $\omega_{+}$ and $\omega_{-}$, the longest words of $S^{+}$ and $S^{-}$ respectively. Table 1 presents the number of clauses (Column 1) and their arities (Column 2), which are upper bounds for a given constraint group (last column) of the model $DM_{k}$. Table 2 presents the upper bound of the number of Boolean variables that are required and why they are required. We can see in Tables 1 and 2 that the space complexity of $DM_{k}$ is huge ($\mathcal{O}(|S^{+}|.k^{|\omega_{+}|})$ variables, and $\mathcal{O}(|S^{+}|.(|\omega_{+}|+1).k^{|\omega_{+}|})$ clauses), with large clauses (arity up to $|\omega_{+}|+2$), so that only small instances with a small number of states are tractable. It is thus important to improve the model $DM_{k}$.

$$\begin{array}{|l|r|c|}
\hline
\textrm{number of clauses} & \textrm{arity} & \textrm{Constraints}\\
\hline
|S^{+}|.(|\omega_{+}|+1).k^{|\omega_{+}|} & 2 & (4)\\
|S^{+}|.k^{|\omega_{+}|} & |\omega_{+}|+2 & (5)\\
|S^{+}| & k^{|\omega_{+}|} & (6)\\
|S^{-}|.k^{|\omega_{-}|} & |\omega_{-}|+1 & (7)\\
\hline
\end{array}$$

Table 1: Clauses for $DM_{k}$

$$\begin{array}{|l|l|}
\hline
\textrm{number of variables} & \textrm{reason}\\
\hline
k & \textrm{final states }F\\
n.k^{2} & \textrm{transitions }\delta\\
|S^{+}|.k^{|\omega_{+}|} & \textrm{Constraints (3)}\\
\hline
\end{array}$$

Table 2: Variables for $DM_{k}$

2.2 Prefix Model [2]

Let $Pref(w)$ be the set of all the non-empty prefixes of the word $w$ and, by extension, $Pref(W)=\cup_{w\in W}Pref(w)$ the set of prefixes of the words of the set $W$. For each $w\in Pref(S)$, we add a Boolean variable $p_{w,\vv{q_{1}q_{i}}}$ which determines whether or not there is a c_path for $w$ from state $q_{1}$ to $q_{i}$. Note that these variables can be seen as labels of the Prefix Tree Acceptor (PTA) for $S$ [3]. The problem can be modeled with the following constraints:
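The set $Pref(S)$ can be computed directly from the sample, as in the short sketch below. The function name `pref` and the toy sample are assumptions made for this illustration. Because shared prefixes are counted once, $|Pref(S)|$ never exceeds the total length of the words.

```python
from typing import Iterable, Set


def pref(words: Iterable[str]) -> Set[str]:
    """All non-empty prefixes of the given words (shared prefixes counted once)."""
    return {w[:end] for w in words for end in range(1, len(w) + 1)}


sample = ["ab", "abb", "ba"]            # hypothetical toy sample
prefixes = pref(sample)
total_length = sum(len(w) for w in sample)
print(sorted(prefixes), len(prefixes), total_length)
```

Here "a" and "ab" are shared by two words, so only 5 prefix variables per state are needed instead of 7.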

  1. For each prefix $w=a$ with $w\in Pref(S)$ and $a\in\Sigma$, there is a c_path of size 1 for $w$:

    $$\bigvee_{i\in K}\delta_{a,\vv{q_{1}q_{i}}}\leftrightarrow p_{a,\vv{q_{1}q_{i}}} \tag{8}$$

    With the Tseitin transformations, we can derive a CNF formula. It is also possible to directly encode $\delta_{a,\vv{q_{1}q_{i}}}$ and $p_{a,\vv{q_{1}q_{i}}}$ as the same variable. Thus, no clause is required.

  2. For all words $w\in S^{+}-\{\lambda\}$:

    $$\bigvee_{i\in K}p_{w,\vv{q_{1}q_{i}}}\wedge f_{i} \tag{9}$$

    With the Tseitin transformations [14], we create one auxiliary variable for each combination of $p_{w,\vv{q_{1}q_{i}}}$ and the status (final or not) of the state $q_{i}$: $aux_{w,i}\leftrightarrow p_{w,\vv{q_{1}q_{i}}}\wedge f_{i}$. Hence, for each $w$, we obtain a formula in CNF:

    $$\bigwedge_{i\in K}\big((\neg aux_{w,i}\vee p_{w,\vv{q_{1}q_{i}}})\wedge(\neg aux_{w,i}\vee f_{i})\big) \tag{10}$$
    $$\bigwedge_{i\in K}(aux_{w,i}\vee\neg p_{w,\vv{q_{1}q_{i}}}\vee\neg f_{i}) \tag{11}$$
    $$\bigvee_{i\in K}aux_{w,i} \tag{12}$$
  3. For all words $w\in S^{-}-\{\lambda\}$, we obtain the following CNF constraint:

    $$\bigwedge_{i\in K}(\neg p_{w,\vv{q_{1}q_{i}}}\vee\neg f_{i}) \tag{13}$$
  4. For each prefix $w=va$, with $w\in Pref(S)$, $v\in Pref(S)$ and $a\in\Sigma$:

    $$\bigwedge_{i\in K}\Big(p_{w,\vv{q_{1}q_{i}}}\leftrightarrow\big(\bigvee_{j\in K}p_{v,\vv{q_{1}q_{j}}}\wedge\delta_{a,\vv{q_{j}q_{i}}}\big)\Big) \tag{14}$$

    Applying the Tseitin transformations, we create one auxiliary variable for each combination of the existence of a c_path from $q_{1}$ to $q_{j}$ ($p_{v,\vv{q_{1}q_{j}}}$) and the transition $\delta_{a,\vv{q_{j}q_{i}}}$: $aux_{v,a,j,i}\leftrightarrow p_{v,\vv{q_{1}q_{j}}}\wedge\delta_{a,\vv{q_{j}q_{i}}}$. Then, (14) becomes:

    $$\bigwedge_{i\in K}\Big(p_{w,\vv{q_{1}q_{i}}}\leftrightarrow\big(\bigvee_{j\in K}aux_{v,a,j,i}\big)\Big)$$

    For each $w\in Pref(S)$, we obtain constraints in CNF:

    $$\bigwedge_{(i,j)\in K^{2}}(\neg aux_{v,a,j,i}\vee p_{v,\vv{q_{1}q_{j}}}) \tag{15}$$
    $$\bigwedge_{(i,j)\in K^{2}}(\neg aux_{v,a,j,i}\vee\delta_{a,\vv{q_{j}q_{i}}}) \tag{16}$$
    $$\bigwedge_{(i,j)\in K^{2}}(aux_{v,a,j,i}\vee\neg p_{v,\vv{q_{1}q_{j}}}\vee\neg\delta_{a,\vv{q_{j}q_{i}}}) \tag{17}$$
    $$\bigwedge_{i\in K}\Big(\neg p_{w,\vv{q_{1}q_{i}}}\vee\big(\bigvee_{j\in K}aux_{v,a,j,i}\big)\Big) \tag{18}$$
    $$\bigwedge_{(i,j)\in K^{2}}(p_{w,\vv{q_{1}q_{i}}}\vee\neg aux_{v,a,j,i}) \tag{19}$$

Thus, the constraint prefix model $PM_{k}$ for building an NFA of size $k$ is:

$$PM_{k}=\bigwedge_{w\in S^{+}}\Big((10)\wedge\ldots\wedge(12)\Big)\wedge\bigwedge_{w\in S^{-}}(13)\wedge\bigwedge_{w\in Pref(S)}\Big((15)\wedge\ldots\wedge(19)\Big)$$

and is possibly completed by (1) or (2) if $\lambda\in S^{+}$ or $\lambda\in S^{-}$.
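The meaning of the prefix variables can be illustrated on a fixed NFA: the recurrence of Constraint (14) computes, for each prefix, the set of states reachable from $q_{1}$, and each shared prefix is computed only once. The sketch below is an assumption-laden illustration, not the CNF generator itself: the dictionary encoding of transitions and the name `prefix_reachability` are hypothetical, and the empty prefix is handled by the convention that it stays in $q_{1}$.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed representation: delta[(state, symbol)] is the set of successor states.
Transitions = Dict[Tuple[int, str], Set[int]]


def prefix_reachability(words: Iterable[str], delta: Transitions) -> Dict[str, Set[int]]:
    """reach[w] = {i | some c_path for prefix w leads from q_1 to q_i}."""
    reach: Dict[str, Set[int]] = {"": {1}}  # convention for the empty prefix
    for word in words:
        for end in range(1, len(word) + 1):
            w = word[:end]
            if w in reach:
                continue  # shared prefix, already computed
            v, a = w[:-1], w[-1]
            # Recurrence (14): p_{w,i} holds iff p_{v,j} and delta_{a, q_j -> q_i} for some j.
            reach[w] = {i for j in reach[v] for i in delta.get((j, a), set())}
    del reach[""]
    return reach


delta = {(1, "a"): {1, 2}, (2, "b"): {2}}   # hypothetical 2-state NFA
print(prefix_reachability(["ab", "abb", "a"], delta))
```

In the SAT model the transitions are unknown, so the same recurrence is stated as clauses over the variables $p$ and $\delta$ rather than evaluated.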

Size of the models

Consider $\omega_{+}$, the longest word of $S^{+}$, $\omega_{-}$, the longest word of $S^{-}$, $\sigma=\sum_{w\in S}|w|$, and $\pi$, the number of prefixes obtained by $Pref(S)$ with a size larger than 1 ($\pi=|\{x~|~x\in Pref(S),|x|>1\}|$), then:

$$\max(|\omega_{+}|,|\omega_{-}|)\leq\pi\leq\sigma\leq|S^{+}|.|\omega_{+}|+|S^{-}|.|\omega_{-}|$$

The space complexity of the $PM_{k}$ model is thus in $\mathcal{O}(\sigma.k^{2})$ variables, and in $\mathcal{O}(\sigma.k^{2})$ binary and ternary clauses, and $\mathcal{O}(\sigma.k)$ $(k+1)$-ary clauses.

$$\begin{array}{|l|r|c|}
\hline
\textrm{number of clauses} & \textrm{arity} & \textrm{Constraints}\\
\hline
2.k.|S^{+}| & 2 & (10)\\
k.|S^{+}| & 3 & (11)\\
k & k+1 & (12)\\
k.|S^{-}| & 2 & (13)\\
\pi.k^{2} & 2 & (15)\\
\pi.k^{2} & 2 & (16)\\
\pi.k^{2} & 3 & (17)\\
\pi.k & k+1 & (18)\\
\pi.k^{2} & 2 & (19)\\
\hline
\end{array}$$

Table 3: Clauses for $PM_{k}$

$$\begin{array}{|l|l|}
\hline
\textrm{number of variables} & \textrm{reason}\\
\hline
k & \textrm{final states }F\\
n.k^{2} & \textrm{transitions }\delta\\
|S^{+}|.k & \textrm{Constraints (9)}\\
\pi.k^{2} & \textrm{Constraints (14)}\\
\hline
\end{array}$$

Table 4: Variables for $PM_{k}$

2.3 Suffix Model

We now propose a suffix model ($SM_{k}$), based on $Suf(S)$, the set of all the non-empty suffixes of all the words in $S$. The main difference is that the construction starts from every state and terminates in state $q_{1}$. For each $w\in Suf(S)$, we add a Boolean variable $p_{w,\vv{q_{i}q_{j}}}$ which determines whether or not there is a c_path for $w$ from state $q_{i}$ to $q_{j}$. To model the problem, Constraints (10), (11), (12), and (13) remain unchanged, as does the creation of the corresponding auxiliary variables $aux_{w,i}$.

For each suffix $w=a$ with $w\in Suf(S)$ and $a\in\Sigma$, there is a c_path of size 1 for $w$:

$$\bigvee_{(i,j)\in K^{2}}\delta_{a,\vv{q_{i}q_{j}}}\leftrightarrow p_{a,\vv{q_{i}q_{j}}} \tag{20}$$

We can directly encode $\delta_{a,\vv{q_{i}q_{j}}}$ and $p_{a,\vv{q_{i}q_{j}}}$ as the same variable. Thus, no clause is required.

For each suffix $w=av$, with $w\in Suf(S)$, $v\in Suf(S)$ and $a\in\Sigma$:

$$\bigwedge_{(i,j)\in K^{2}}\Big(p_{w,\vv{q_{i}q_{j}}}\leftrightarrow\big(\bigvee_{k\in K}\delta_{a,\vv{q_{i}q_{k}}}\wedge p_{v,\vv{q_{k}q_{j}}}\big)\Big) \tag{21}$$

We create one auxiliary variable for each combination of the existence of a c_path from $q_{k}$ to $q_{j}$ ($p_{v,\vv{q_{k}q_{j}}}$) and the transition $\delta_{a,\vv{q_{i}q_{k}}}$: $aux_{v,a,i,k,j}\leftrightarrow\delta_{a,\vv{q_{i}q_{k}}}\wedge p_{v,\vv{q_{k}q_{j}}}$.

For each $w=av$, we obtain the following constraints (CNF formulas):

$$\bigwedge_{(i,j,k)\in K^{3}}(\neg aux_{v,a,i,k,j}\vee p_{v,\vv{q_{k}q_{j}}}) \tag{22}$$
$$\bigwedge_{(i,j,k)\in K^{3}}(\neg aux_{v,a,i,k,j}\vee\delta_{a,\vv{q_{i}q_{k}}}) \tag{23}$$
$$\bigwedge_{(i,j,k)\in K^{3}}(aux_{v,a,i,k,j}\vee\neg p_{v,\vv{q_{k}q_{j}}}\vee\neg\delta_{a,\vv{q_{i}q_{k}}}) \tag{24}$$
$$\bigwedge_{(i,j)\in K^{2}}\Big(\neg p_{w,\vv{q_{i}q_{j}}}\vee\big(\bigvee_{k\in K}aux_{v,a,i,k,j}\big)\Big) \tag{25}$$
$$\bigwedge_{(i,j,k)\in K^{3}}(p_{w,\vv{q_{i}q_{j}}}\vee\neg aux_{v,a,i,k,j}) \tag{26}$$

Note that some clauses are not worth being generated. Indeed, it is useless to generate paths starting in states different from the initial state $q_{1}$, except when $w$ is in $S$ and $w$ is also the suffix of another word from $S$. Removing these constraints does not change the complexity of the model. This can easily be done at generation time, or we can leave it to the solver, which will detect it and remove the useless constraints.

Thus, the constraint suffix model $SM_{k}$ for building an NFA of size $k$ is:

$$SM_{k}=\bigwedge_{w\in S^{+}}\Big((10)\wedge\ldots\wedge(12)\Big)\wedge\bigwedge_{w\in S^{-}}(13)\wedge\bigwedge_{w\in Suf(S)}\Big((22)\wedge\ldots\wedge(26)\Big)$$

and is possibly completed by (1) or (2) if $\lambda\in S^{+}$ or $\lambda\in S^{-}$.

Size of the models

Consider $\omega_{+}$, $\omega_{-}$, $\sigma$, and $\pi$ as defined in the prefix model. Table 5 presents the number of clauses (Column 1) and their arities (Column 2), which are upper bounds for a given constraint group (last column) of the model $SM_{k}$. Table 6 presents the upper bound of the number of Boolean variables that are required, and why they are required. To simplify, the space complexity of $SM_{k}$ is thus in $\mathcal{O}(\sigma.k^{3})$ variables, and in $\mathcal{O}(\sigma.k^{3})$ binary and ternary clauses, and $\mathcal{O}(\sigma.k^{2})$ $(k+1)$-ary clauses.

$$\begin{array}{|l|r|c|}
\hline
\textrm{number of clauses} & \textrm{arity} & \textrm{Constraints}\\
\hline
2.k.|S^{+}| & 2 & (10)\\
k.|S^{+}| & 3 & (11)\\
k & k+1 & (12)\\
k.|S^{-}| & 2 & (13)\\
\pi.k^{3} & 2 & (22)\\
\pi.k^{3} & 2 & (23)\\
\pi.k^{3} & 3 & (24)\\
\pi.k^{2} & k+1 & (25)\\
\pi.k^{3} & 2 & (26)\\
\hline
\end{array}$$

Table 5: Clauses for $SM_{k}$

$$\begin{array}{|l|l|}
\hline
\textrm{number of variables} & \textrm{reason}\\
\hline
k & \textrm{final states }F\\
n.k^{2} & \textrm{transitions }\delta\\
|S^{+}|.k & \textrm{Constraints (9)}\\
\pi.k^{3} & \textrm{Constraints (21)}\\
\hline
\end{array}$$

Table 6: Variables for $SM_{k}$
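To see how the extra factor $k$ weighs in practice, the upper bounds on the number of variables for $PM_{k}$ and $SM_{k}$ can be evaluated side by side. The function names and the numeric values below are hypothetical; the formulas simply sum the rows of the two variable tables.

```python
def pm_variables(n: int, k: int, n_pos: int, pi: int) -> int:
    """Upper bound for PM_k: k + n.k^2 + |S+|.k + pi.k^2."""
    return k + n * k**2 + n_pos * k + pi * k**2


def sm_variables(n: int, k: int, n_pos: int, pi: int) -> int:
    """Upper bound for SM_k: k + n.k^2 + |S+|.k + pi.k^3."""
    return k + n * k**2 + n_pos * k + pi * k**3


# Hypothetical setting: 2 symbols, 5 states, 10 positive words, pi = 40.
print(pm_variables(2, 5, 10, 40), sm_variables(2, 5, 10, 40))
```

With these illustrative values, the bound grows from 1105 to 5105 variables, almost entirely because of the $\pi.k^{3}$ term.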

3 Hybrid Models

We now propose a family of models based on both the notion of prefix and the notion of suffix. The idea is to take advantage of the construction of a prefix $p$ and a suffix $s$ of a word $w$ such that $w=p.s$ to pool both prefixes and suffixes. The goal is to reduce the size of the generated SAT instances. The process is the following:

  1. For each word $w_{i}$ of $S$, we split $w_{i}$ into $p_{i}$ and $s_{i}$ such that $w_{i}=p_{i}.s_{i}$. We thus obtain two sets, $S_{p}=\{p_{i}~|~\exists i,w_{i}\in S\textrm{ and }w_{i}=p_{i}.s_{i}\}$ and $S_{s}=\{s_{i}~|~\exists i,w_{i}\in S\textrm{ and }w_{i}=p_{i}.s_{i}\}$.

  2. We then consider $S_{p}$ as a sample, i.e., a set of words. For each $w$ of $S_{p}$, we generate Constraints (15) to (19).

  3. We consider $S_{s}$ in turn to generate Constraints (22) to (26) for each $w\in S_{s}$.

  4. Then, for each $w_{i}=p_{i}.s_{i}$, clauses corresponding to $p_{i}$ must be linked to clauses of $s_{i}$.

    • if $w_{i}=p_{i}.s_{i}\in S^{-}$, the constraints are similar to the ones of (13), including the connection of $p_{i}$ and $s_{i}$:

      $$\bigwedge_{(j,k)\in K^{2}}(\neg p_{p_{i},\vv{q_{1}q_{j}}}\vee\neg p_{s_{i},\vv{q_{j}q_{k}}}\vee\neg f_{k}) \tag{27}$$
    • if $w_{i}=p_{i}.s_{i}\in S^{+}$, the constraints are similar to (9):

      $$\bigvee_{(j,k)\in K^{2}}p_{p_{i},\vv{q_{1}q_{j}}}\wedge p_{s_{i},\vv{q_{j}q_{k}}}\wedge f_{k} \tag{28}$$

      We transform (28) using auxiliary variables $aux_{w_{i},j,k}\leftrightarrow p_{p_{i},\vv{q_{1}q_{j}}}\wedge p_{s_{i},\vv{q_{j}q_{k}}}\wedge f_{k}$ to obtain the following CNF constraints:

      $$\bigwedge_{(j,k)\in K^{2}}\big((\neg aux_{w_{i},j,k}\vee p_{p_{i},\vv{q_{1}q_{j}}})\wedge(\neg aux_{w_{i},j,k}\vee p_{s_{i},\vv{q_{j}q_{k}}})\wedge(\neg aux_{w_{i},j,k}\vee f_{k})\big) \tag{29}$$
      $$\bigwedge_{(j,k)\in K^{2}}(aux_{w_{i},j,k}\vee\neg p_{p_{i},\vv{q_{1}q_{j}}}\vee\neg p_{s_{i},\vv{q_{j}q_{k}}}\vee\neg f_{k}) \tag{30}$$
      $$\bigvee_{(j,k)\in K^{2}}aux_{w_{i},j,k} \tag{31}$$

Thus, the hybrid model $HM_{k}$ for building an NFA of size $k$ is:

$$HM_{k}=\bigwedge_{w\in S^{+}}\Big((29)\wedge\ldots\wedge(31)\Big)\wedge\bigwedge_{w\in S^{-}}(27)\wedge\bigwedge_{p_{i}\in Pref(S_{p})}\Big((15)\wedge\ldots\wedge(19)\Big)\wedge\bigwedge_{s_{i}\in Suf(S_{s})}\Big((22)\wedge\ldots\wedge(26)\Big)$$

and it is possibly completed by (1) or (2) if $\lambda\in S^{+}$ or $\lambda\in S^{-}$.

We do not detail it here, but in the worst case, the complexity of the model is the same as that of $SM_{k}$. The split of each word into a prefix and a suffix determines the size of the instance. The next sub-sections are dedicated to the computation of this separation $w_{i}=p_{i}.s_{i}$ to minimize the size of the generated hybrid instances with the $HM_{k}$ model.

3.1 Search Space and Evaluation Function For Metaheuristics

The search space $\mathcal{X}$ of this problem corresponds to all the hybrid models: for each word $w$ of $S$, we have to determine an integer $n$ such that $w=p.s$ with $|p|=n$ and $|s|=|w|-n$. The size of the search space is thus $|\mathcal{X}|=\prod_{w\in S}(|w|+1)$.
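As a quick check of this count, the size of the search space can be computed directly from the word lengths. The function name `search_space_size` and the toy sample are assumptions made for this example; `math.prod` requires Python 3.8 or later.

```python
from math import prod
from typing import Iterable


def search_space_size(words: Iterable[str]) -> int:
    """Number of hybrid models: each word w has |w| + 1 possible split points."""
    return prod(len(w) + 1 for w in words)


print(search_space_size(["ab", "abb"]))  # 3 split points times 4 split points
```

Since the size is a product over all words, exhaustive enumeration quickly becomes impractical, which motivates the metaheuristics of the following sub-sections.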

Even though we are aware that smaller instances are not necessarily easier to solve, we choose to define the first evaluation function as the number of generated SAT variables. However, this number cannot be computed a priori: first, the instance has to be generated, before counting the variables. This function being too costly, we propose an alternative evaluation function for approximating the number of variables. This fitness function is based on the number of prefixes in $Pref(S_{p})$ and suffixes in $Suf(S_{s})$. Since the complexity of $SM_{k}$ is in $\mathcal{O}(k^{3})$ whereas the complexity of $PM_{k}$ is in $\mathcal{O}(k^{2})$, suffixes are penalized by a coefficient corresponding to the number of states.

$$fitness(S_{p},S_{s})=|Pref(S_{p})|+k.|Suf(S_{s})|$$

Empirically, we observe that the results of this $fitness$ function are proportional to the actual number of generated SAT variables. This approximation of the number of variables will thus be the fitness function in our ILS and GA algorithms.
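A minimal sketch of this fitness function is given below. It assumes that a candidate solution is encoded as one split position per word (a representation chosen for this illustration), so that word $w$ with position $n$ contributes the prefix `w[:n]` and the suffix `w[n:]`.

```python
from typing import Sequence


def fitness(words: Sequence[str], splits: Sequence[int], k: int) -> int:
    """|Pref(S_p)| + k * |Suf(S_s)| for the split w = w[:n] . w[n:] of each word."""
    # Non-empty prefixes of every chosen prefix w[:n].
    prefixes = {w[:e] for w, n in zip(words, splits) for e in range(1, n + 1)}
    # Non-empty suffixes of every chosen suffix w[n:].
    suffixes = {w[b:] for w, n in zip(words, splits) for b in range(n, len(w))}
    return len(prefixes) + k * len(suffixes)


words = ["ab", "abb"]               # hypothetical toy sample
print(fitness(words, [1, 1], 3))    # S_p = {a}, S_s = {b, bb}
print(fitness(words, [2, 3], 3))    # pure prefix model
print(fitness(words, [0, 0], 3))    # pure suffix model
```

Only sets of strings are built, so the evaluation is much cheaper than generating the SAT instance and counting its variables.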

3.2 Iterated Local Search Hybrid Model $HM\_ILS_{k}$

We propose an Iterated Local Search (ILS) [15] for optimizing our hybrid model. Classically, a best improvement or a first improvement neighborhood is used in ILS to select the next move. In our case, a first improvement provides very poor results. Moreover, it is clearly impossible to evaluate all the neighbors at each step due to the computing cost. We thus decide to randomly choose a word in $S$ with a roulette wheel selection based on the word weights. For each word $w$, 75% of the weight is a uniform share depending only on $|S|$, and 25% is proportional to the length of the word:

$$\textrm{weight}_{w}=\frac{75\%}{|S|}+25\%\cdot\frac{|w|}{\sum_{w_{i}\in S}|w_{i}|}$$

The search starts by generating a random couple of prefix and suffix sets ($S_{p}$,$S_{s}$), i.e., for each word $w$ of $S$ an integer is selected for splitting $w$ into a prefix $p$ and a suffix $s$ such that $w=p.s$. Then, at each iteration, the best couple $(p,s)$ is found for the selected word $w$. This process is iterated until a maximum number of iterations is reached.

In our ILS, it is not necessary to introduce noise with random walks or restarts because our process of selection of word naturally ensures diversification.

Require: set of words $S$, maximum number of iterations $max\_iter$, maximum number of consecutive iterations allowed without improvement $max\_iter\_without\_improv$
Ensure: set of prefixes $S^{*}_{p}$, set of suffixes $S^{*}_{s}$
A couple of prefix and suffix sets ($S_{p}$,$S_{s}$) is randomly generated
($S^{*}_{p}$,$S^{*}_{s}$) = ($S_{p}$,$S_{s}$)
repeat
   Choose a word $w$ in $S$ with a roulette wheel selection
   ($S_{p}$,$S_{s}$) is updated by the best couple of the sub-search space corresponding only to a modification of the prefix and the suffix of word $w$
   if $fitness(S_{p},S_{s})<fitness(S^{*}_{p},S^{*}_{s})$ then
      ($S^{*}_{p}$,$S^{*}_{s}$) = ($S_{p}$,$S_{s}$)
   end if
until the maximum number of iterations $max\_iter$ is reached or ($S^{*}_{p}$,$S^{*}_{s}$) has not been improved for $max\_iter\_without\_improv$ iterations
return ($S^{*}_{p}$,$S^{*}_{s}$)
Algorithm 1: Iterated Local Search
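Algorithm 1 can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not the implementation used in the experiments: a solution is assumed to be a list of split positions (one per word), the `initial` and `rng` parameters are added only to make runs reproducible, and at least one word is assumed to be non-empty.

```python
import random
from typing import List, Optional, Sequence, Tuple


def fitness(words: Sequence[str], splits: Sequence[int], k: int) -> int:
    """Approximate instance size: |Pref(S_p)| + k * |Suf(S_s)|."""
    prefixes = {w[:e] for w, n in zip(words, splits) for e in range(1, n + 1)}
    suffixes = {w[b:] for w, n in zip(words, splits) for b in range(n, len(w))}
    return len(prefixes) + k * len(suffixes)


def iterated_local_search(
    words: Sequence[str],
    k: int,
    max_iter: int = 10_000,
    max_iter_without_improv: int = 100,
    initial: Optional[Sequence[int]] = None,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], int]:
    rng = rng or random.Random()
    total_len = sum(len(w) for w in words)
    # Roulette-wheel weights: 75% uniform share, 25% proportional to word length.
    weights = [0.75 / len(words) + 0.25 * len(w) / total_len for w in words]
    current = (list(initial) if initial is not None
               else [rng.randint(0, len(w)) for w in words])
    best, best_fit = list(current), fitness(words, current, k)
    stale = 0
    for _ in range(max_iter):
        idx = rng.choices(range(len(words)), weights=weights)[0]

        def score(n: int) -> int:
            # Fitness when only the split of the chosen word is changed to n.
            return fitness(words, current[:idx] + [n] + current[idx + 1:], k)

        current[idx] = min(range(len(words[idx]) + 1), key=score)
        current_fit = fitness(words, current, k)
        if current_fit < best_fit:
            best, best_fit, stale = list(current), current_fit, 0
        else:
            stale += 1
            if stale >= max_iter_without_improv:
                break
    return best, best_fit


words = ["ab", "abb", "bab", "ba"]   # hypothetical toy sample
print(iterated_local_search(words, k=3, max_iter=200, rng=random.Random(1)))
```

Each iteration evaluates only the $|w|+1$ splits of the selected word, which keeps the neighborhood small compared with evaluating all neighbors of the current solution.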

3.3 Genetic Algorithm Hybrid Model $HM\_GA_{k}$

We propose a classical genetic algorithm (GA) based on the search space and fitness function presented in Section 3.1. A population of individuals, each represented by a couple of prefix and suffix sets, is improved generation after generation. Each generation keeps a portion of individuals as parents and creates children by crossing the selected parents. The crossover operator used in our GA is the well-known uniform crossover: for each word, children inherit the prefix and the suffix of one of their parents, chosen randomly. Since the population size is the same during the whole search, we have a steady-state GA. A mutation process is applied over all individuals with a probability $p_{mut}$. For each word $w$, the prefix and suffix are randomly mutated by generating an integer $n$ between $0$ and $|w|$, splitting $w$ into a new prefix of size $n$ and a new suffix of size $|w|-n$. The search stops when the maximum number of generations is reached or when no improvement is observed in the population during $max\_gen\_without\_improv$ generations.

Require: set of words $S$, population size $s_{\cal{P}}$, mutation probability $p_{mut}$, maximum number of generations $max\_gen$, portion of the population conserved in the next generation $p_{parents}$, maximum number of consecutive generations allowed without improvement $max\_gen\_without\_improv$
Ensure: set of prefixes $S^{*}_{p}$, set of suffixes $S^{*}_{s}$
A population $\cal{P}$ of couples of prefix and suffix sets ($S_{p}$,$S_{s}$) is randomly generated
($S^{*}_{p}$,$S^{*}_{s}$) = $Argmin_{fitness}(\cal{P})$
repeat
   Select as parents set $Par$ a portion $p_{parents}$ of $\cal{P}$
   Generate $(1-p_{parents}).s_{\cal{P}}$ children by uniform crossover over parents in a set $Children$
   ${\cal{P}}=Par\cup Children$
   Mutate for each individual of $\cal{P}$ the prefix/suffix for each word of $S$ with a probability $p_{mut}$
   Update the population
   Update ($S^{*}_{p}$,$S^{*}_{s}$) if necessary
until the maximum number of generations $max\_gen$ is reached or ($S^{*}_{p}$,$S^{*}_{s}$) has not been improved for $max\_gen\_without\_improv$ generations
return $S^{*}_{p}$ and $S^{*}_{s}$
Algorithm 2: Genetic Algorithm
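Algorithm 2 can be sketched along the same lines. Several points are assumptions of this sketch rather than statements about the actual implementation: individuals are lists of split positions, parents are taken to be the best-ranked individuals (the selection rule is not specified above), at least two parents are always kept so that crossover is possible, and the population size is assumed to be at least 2.

```python
import random
from typing import List, Optional, Sequence, Tuple


def fitness(words: Sequence[str], splits: Sequence[int], k: int) -> int:
    """Approximate instance size: |Pref(S_p)| + k * |Suf(S_s)|."""
    prefixes = {w[:e] for w, n in zip(words, splits) for e in range(1, n + 1)}
    suffixes = {w[b:] for w, n in zip(words, splits) for b in range(n, len(w))}
    return len(prefixes) + k * len(suffixes)


def genetic_search(
    words: Sequence[str],
    k: int,
    pop_size: int = 100,
    p_mut: float = 0.05,
    p_parents: float = 0.03,
    max_gen: int = 3000,
    max_gen_without_improv: int = 100,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], int]:
    rng = rng or random.Random()

    def fit(ind: Sequence[int]) -> int:
        return fitness(words, ind, k)

    population = [[rng.randint(0, len(w)) for w in words] for _ in range(pop_size)]
    best = list(min(population, key=fit))
    best_fit = fit(best)
    n_parents = max(2, round(p_parents * pop_size))  # assumption: keep at least 2
    stale = 0
    for _ in range(max_gen):
        # Assumed selection rule: the best-ranked individuals become parents.
        population.sort(key=fit)
        parents = population[:n_parents]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each word inherits its split from either parent.
            children.append([rng.choice(pair) for pair in zip(a, b)])
        population = parents + children
        # Mutation: each split is redrawn with probability p_mut.
        for ind in population:
            for idx, w in enumerate(words):
                if rng.random() < p_mut:
                    ind[idx] = rng.randint(0, len(w))
        challenger = min(population, key=fit)
        if fit(challenger) < best_fit:
            best, best_fit, stale = list(challenger), fit(challenger), 0
        else:
            stale += 1
            if stale >= max_gen_without_improv:
                break
    return best, best_fit


words = ["ab", "abb", "bab", "ba"]   # hypothetical toy sample
print(genetic_search(words, k=3, pop_size=20, max_gen=50, rng=random.Random(0)))
```

Every generation evaluates the whole population, which is consistent with the longer generation times reported for the GA compared with the ILS.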

4 Experimental results

To test our new models, we work on the training set of the StaMinA Competition (see http://stamina.chefbe.net). We use 11 of the instances selected in [2] (we kept the "official" names used in [2]) with a sparsity $s\in\{12.5\%,25\%,50\%,100\%\}$ and an alphabet size $|\Sigma|\in\{2,5,10\}$. We try to generate SAT instances for NFA sizes ($k$) close to the threshold between the existence and the non-existence of an NFA.

4.1 Experimental Protocol

All our algorithms are implemented in Python using specific libraries such as PySAT. The experiments were carried out on a computing cluster with Intel-E5-2695 CPUs, with a memory limit of 10 GB. Running times were limited to 10 minutes, including both model generation and solving. We used the Glucose [13] SAT solver with its default options. For the stochastic methods (ILS and GA), 30 runs are performed so that the results can be analyzed statistically.
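As an illustration of how repeated runs of a stochastic method can be summarized, here is a minimal sketch using only the Python standard library. The function `summarize_runs` and the toy procedure `toy_run` are hypothetical stand-ins; `toy_run` merely imitates one ILS or GA run returning an instance size and is not taken from our implementation.

```python
import random
import statistics


def summarize_runs(run, n_runs=30, base_seed=0):
    """Run a stochastic procedure n_runs times with distinct seeds.

    Returns the mean and the sample standard deviation of the returned values.
    """
    values = [run(random.Random(base_seed + i)) for i in range(n_runs)]
    return statistics.mean(values), statistics.stdev(values)


def toy_run(rng):
    """Hypothetical stand-in for one run: a size that fluctuates slightly."""
    return 1000 + rng.randint(-5, 5)


mean_size, std_size = summarize_runs(toy_run)
```

Using one seeded generator per run keeps the 30 runs independent and reproducible.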

Parameters used for our hybrid models are:

| ILS parameter | Value | GA parameter | Value |
|---|---|---|---|
| $max\_iter$ | 10 000 | $s_{\mathcal{P}}$ | 100 |
| $max\_iter\_without\_improv$ | 100 | $max\_gen$ | 3000 |
| | | $max\_gen\_without\_improv$ | 100 |
| | | $p_{mut}$ | 0.05 |
| | | $p_{parents}$ | 0.03 |

4.2 Results

Our experiments are reported in Table 7. The first column ($Instance$) gives the official name of the instance, and the second one ($k$) the number of states of the expected NFA. Then come, in order, the model name ($Model$), the number of SAT variables ($Var.$), the number of clauses ($Cl.$), and the instance generation time ($t_M$). The right part of the table concerns solving: the satisfiability of the generated instance ($SAT$), the number of decisions ($Dec.$), and the solving time ($t_S$) with Glucose. Finally, the last column ($t_T$) gives the total time (generation time + solving time). All times are in seconds. Results for the hybrid models based on ILS ($HM\_ILS_k$) and GA ($HM\_GA_k$) are averages over 30 runs. We only report averages since the standard deviations are very small.

The last lines of the table give the cumulative values of each column for each model. When an instance is not solved (time-out), the maximum value obtained by the other models on that instance is used instead. For the instance generation time ($t_M$), a penalty of 600 seconds is applied when generation did not finish before the time-out.
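The penalty rule can be checked on the prefix model, which times out only on st-2-30. The sketch below is one reading of the rule that reproduces the cumulative $t_M$ and $t_S$ reported for $PM_k$ in Table 7. The helper names `penalized` and `cumulative` are hypothetical, and the values are copied from Table 7.

```python
TIMEOUT = 600.0  # generation penalty in seconds, as stated in the text

# (t_M, t_S) per model on st-2-30, copied from Table 7; None marks a time-out.
ST_2_30 = {"DM": None, "PM": None, "SM": None,
           "HM_ILS": (7.55, 228.53), "HM_GA": (66.94, 527.44)}

# (t_M, t_S) of the prefix model PM on the other instances of Table 7.
PM_OTHER = [(1.27, 0.10), (1.34, 241.24), (1.49, 0.08), (1.43, 21.57),
            (1.52, 0.55), (1.59, 562.80), (1.52, 383.37), (1.38, 0.03),
            (1.28, 20.29)]


def penalized(rows, model):
    """Return the (t_M, t_S) pair counted for `model` in the cumulative line."""
    if rows[model] is not None:
        return rows[model]
    # Time-out: charge the full generation budget and the worst solving
    # time observed among the models that did solve the instance.
    worst = max(r[1] for m, r in rows.items() if m != model and r is not None)
    return TIMEOUT, worst


def cumulative(rows):
    """Sum generation and solving times, rounded to two decimals."""
    return (round(sum(r[0] for r in rows), 2),
            round(sum(r[1] for r in rows), 2))


pm_total = cumulative(PM_OTHER + [penalized(ST_2_30, "PM")])
# pm_total is (612.82, 1757.47), the cumulative t_M and t_S of PM in Table 7
```

The cumulative total time of $PM_k$ is then the sum of these two values, $612.82 + 1757.47 = 2370.29$.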

We can clearly confirm that the direct model is not usable in practice: for most benchmarks, its instances cannot be generated in less than 600 s. The prefix model allows the fastest generation when it terminates before the time-out (on these benchmarks it failed once, on st-2-30, and was thus penalized in the cumulative values). It also provides instances that are solved quite fast. As expected, the instances optimized with GA are the smallest ones. However, their generation is too costly: the gain in solving time is not sufficient to compensate for the long generation time. Overall, in terms of generation plus solving time, the GA-based model is close to the prefix model. As expected from its space complexity (in $\mathcal{O}(k^{3})$), suffix-based instances are huge and long to solve. However, we were surprised by 2 benchmarks (ww-10-40 and ww-10-50) for which the generated instances are relatively big (5 times the size of the GA-optimized instances) but their solving is the fastest. We still cannot explain what makes these instances easy to solve, and we are still investigating their structure. The best balance is achieved by the ILS model: the instances are relatively small, and both generation and solving are fast. It is thus the best option of this work.

It is very difficult to compare our results with those of [2]. First, the authors of [2] try to minimize $k$, the number of states. Moreover, they use parallel algorithms. Finally, they do not detail the results for each instance and each $k$, except for st-2-30 and st-5-50. For the first one, with $k=9$, we are much faster; for the second one, with $k=5$, we are slower.

Table 7: Comparison on 11 instances between the models $DM_k$, $PM_k$, SM$_k$, $HM\_ILS_k$, and $HM\_GA_k$.

| Instance | $k$ | Model | Var. | Cl. | $t_M$ | SAT | Dec. | $t_S$ | $t_T$ |
|---|---|---|---|---|---|---|---|---|---|
| st-2-10 | 4 | $DM_k$ | 190 564 | 1 817 771 | 13.46 | True | 973 213 | 88.62 | 102.09 |
| | | $PM_k$ | 1 276 | 4 250 | 1.27 | True | 3 471 | 0.10 | 1.37 |
| | | $SM_k$ | 5 196 | 17 578 | 1.30 | True | 4 332 | 0.21 | 1.51 |
| | | $HM\_ILS_k$ | 1 188 | 4 179 | 4.14 | True | 2 503 | 0.05 | 4.18 |
| | | $HM\_GA_k$ | 1 107 | 3 884 | 14.45 | True | 2 368 | 0.05 | 14.50 |
| st-2-20 | 6 | $DM_k$ | - | - | - | - | - | - | - |
| | | $PM_k$ | 4 860 | 17 150 | 1.34 | False | 1 625 706 | 241.24 | 242.59 |
| | | $SM_k$ | - | - | - | - | - | - | - |
| | | $HM\_ILS_k$ | 5 688 | 21 073 | 5.39 | False | 662 354 | 98.35 | 103.74 |
| | | $HM\_GA_k$ | 4 735 | 17 611 | 34.65 | False | 708 356 | 94.95 | 129.61 |
| st-2-30 | 9 | $DM_k$ | - | - | - | - | - | - | - |
| | | $PM_k$ | - | - | - | - | - | - | - |
| | | $SM_k$ | - | - | - | - | - | - | - |
| | | $HM\_ILS_k$ | 20 637 | 78 852 | 7.55 | True | 1 998 574 | 228.53 | 236.07 |
| | | $HM\_GA_k$ | 16 335 | 62 832 | 66.94 | True | 4 079 686 | 527.44 | 594.38 |
| st-5-20 | 4 | $DM_k$ | - | - | - | - | - | - | - |
| | | $PM_k$ | 4 024 | 13 464 | 1.49 | True | 2 641 | 0.08 | 1.57 |
| | | $SM_k$ | 14 964 | 50 660 | 1.68 | True | 23 540 | 3.23 | 4.91 |
| | | $HM\_ILS_k$ | 3 608 | 12 514 | 7.83 | True | 14 584 | 0.72 | 8.56 |
| | | $HM\_GA_k$ | 3 522 | 12 180 | 47.84 | True | 18 344 | 0.94 | 48.78 |
| st-5-30 | 4 | $DM_k$ | - | - | - | - | - | - | - |
| | | $PM_k$ | 5 364 | 18 054 | 1.43 | True | 177 711 | 21.57 | 23.00 |
| | | $SM_k$ | 21 084 | 71 502 | 1.87 | True | 362 318 | 128.02 | 129.89 |
| | | $HM\_ILS_k$ | 4 837 | 16 955 | 9.90 | True | 156 631 | 19.90 | 29.81 |
| | | $HM\_GA_k$ | 4 705 | 16 478 | 119.42 | True | 171 062 | 21.67 | 141.09 |
| st-5-40 | 4 | $DM_k$ | - | - | - | - | - | - | - |
| | | $PM_k$ | 6 284 | 21 216 | 1.52 | False | 7 110 | 0.55 | 2.07 |
| | | $SM_k$ | 23 604 | 80 104 | 1.55 | False | 15 708 | 1.74 | 3.29 |
| | | $HM\_ILS_k$ | 5 745 | 20 290 | 10.29 | False | 6 206 | 0.34 | 10.62 |
| | | $HM\_GA_k$ | 5 548 | 19 517 | 150.50 | False | 6 204 | 0.35 | 150.85 |
| st-5-50 | 5 | $DM_k$ | - | - | - | - | - | - | - |
| | | $PM_k$ | 11 150 | 38 745 | 1.59 | False | 1 943 735 | 562.80 | 564.39 |
| | | $SM_k$ | - | - | - | - | - | - | - |
| | | $HM\_ILS_k$ | 11 085 | 40 258 | 10.80 | False | 911 280 | 238.10 | 248.90 |
| | | $HM\_GA_k$ | 10 040 | 36 350 | 279.87 | False | 1 093 093 | 287.46 | 567.33 |
| st-5-60 | 5 | $DM_k$ | - | - | - | - | - | - | - |
| | | $PM_k$ | 14 200 | 49 455 | 1.52 | False | 1 245 538 | 383.37 | 384.89 |
| | | $SM_k$ | - | - | - | - | - | - | - |
| | | $HM\_ILS_k$ | 13 920 | 50 568 | 13.47 | False | 800 920 | 231.82 | 245.29 |
| | | $HM\_GA_k$ | 13 180 | 47 755 | 313.30 | False | 950 601 | 270.97 | 584.26 |
| ww-10-40 | 4 | $DM_k$ | 15 012 | 112 039 | 2.07 | True | 69 219 | 1.52 | 3.59 |
| | | $PM_k$ | 3 624 | 11 900 | 1.38 | True | 977 | 0.03 | 1.41 |
| | | $SM_k$ | 13 844 | 46 648 | 1.25 | True | 4 173 | 0.02 | 1.28 |
| | | $HM\_ILS_k$ | 2 896 | 10 342 | 5.94 | True | 3 897 | 0.06 | 6.00 |
| | | $HM\_GA_k$ | 2 761 | 9 839 | 75.57 | True | 2 842 | 0.04 | 75.60 |
| ww-10-50 | 4 | $DM_k$ | 80 548 | 694 641 | 5.61 | True | 483 153 | 103.52 | 109.14 |
| | | $PM_k$ | 5 364 | 17 850 | 1.28 | True | 167 390 | 20.29 | 21.57 |
| | | $SM_k$ | 20 844 | 70 482 | 1.49 | True | 74 482 | 11.58 | 13.07 |
| | | $HM\_ILS_k$ | 4 633 | 16 514 | 7.71 | True | 73 534 | 5.46 | 13.17 |
| | | $HM\_GA_k$ | 4 517 | 15 940 | 123.21 | True | 52 894 | 3.38 | 126.59 |
| Cumulative values | | $DM_k$ | 397 451 | 3 014 842 | 4 221.15 | - | 10 821 816 | 2 041.50 | 6 262.65 |
| | | $PM_k$ | 76 783 | 270 936 | 612.82 | - | 9 253 965 | 1 757.47 | 2 370.29 |
| | | $SM_k$ | 151 211 | 525 099 | 2 409.15 | - | 9 379 218 | 1 879.65 | 4 268.80 |
| | | $HM\_ILS_k$ | 74 237 | 271 543 | 83.03 | - | 4 630 483 | 823.33 | 906.36 |
| | | $HM\_GA_k$ | 66 450 | 242 386 | 1 225.75 | - | 7 085 451 | 1 207.23 | 2 432.98 |

5 Conclusion

In this paper, we have proposed to use metaheuristic algorithms, namely ILS and GA, to reduce the size of SAT models for the NFA inference problem. Our hybrid model optimized with GA gives, on average, the smallest SAT instances. Solving these instances is also faster than with the direct or prefix models. However, generating the optimized instances with GA takes too long and is not offset by the gain in solving time; w.r.t. total CPU time, it is at the level of the prefix model. The ILS model generates optimized instances that are a bit larger than with GA and a bit smaller than with prefixes. Moreover, its solving time is the best of our experiments, and its combined generation and solving time makes $HM\_ILS_k$ our best model.

In the future, we plan to speed up GA to make it more competitive. We also plan to consider more complex fitness functions, not only based on the number of SAT variables but also on the length of clauses. We also plan a model portfolio approach for larger samples.

References

  • [1] Wieczorek, W.: Grammatical Inference - Algorithms, Routines and Applications. Volume 673 of Studies in Computational Intelligence. Springer (2017)
  • [2] Jastrzab, T., Czech, Z.J., Wieczorek, W.: Parallel algorithms for minimal nondeterministic finite automata inference. Fundam. Informaticae 178 (2021) 203–227
  • [3] de la Higuera, C.: Grammatical Inference: Learning Automata and Grammars. Cambridge University Press (2010)
  • [4] Denis, F., Lemay, A., Terlutte, A.: Learning regular languages using RFSAs. Theor. Comput. Sci. 313 (2004) 267–294
  • [5] Vázquez de Parga, M., García, P., Ruiz, J.: A family of algorithms for non deterministic regular languages inference. In: Proc. of CIAA 2006. Volume 4094 of LNCS., Springer (2006) 265–274
  • [6] Tomita, M.: Dynamic construction of finite-state automata from examples using hill-climbing. Proc. of the Annual Conference of the Cognitive Science Society (1982) 105–108
  • [7] Dupont, P.: Regular grammatical inference from positive and negative samples by genetic search: the GIG method. In: Proc. of ICGI 94. Volume 862 of LNCS., Springer (1994) 236–245
  • [8] Rossi, F., van Beek, P., Walsh, T., eds.: Handbook of Constraint Programming. 1st edn. Elsevier Science (2006)
  • [9] Lardeux, F., Monfroy, E.: Improved SAT models for NFA learning. In: Proc. of OLA 2021. Communications in Computer and Information Science, Springer (2021) In press.
  • [10] Garey, M.R., Johnson, D.S.: Computers and Intractability, A Guide to the Theory of NP-Completeness. W.H. Freeman & Company, San Francisco (1979)
  • [11] Jastrzab, T.: Two parallelization schemes for the induction of nondeterministic finite automata on PCs. In: Proc. of PPAM 2017. Volume 10777 of LNCS., Springer (2017) 279–289
  • [12] Heule, M., Verwer, S.: Software model synthesis using satisfiability solvers. Empirical Software Engineering 18 (2013) 825–856
  • [13] Audemard, G., Simon, L.: Predicting learnt clauses quality in modern SAT solvers. In: Proc. of IJCAI 2009. (2009) 399–404
  • [14] Tseitin, G.S.: On the Complexity of Derivation in Propositional Calculus. Springer Berlin Heidelberg, Berlin, Heidelberg (1983) 466–483
  • [15] Stützle, T., Ruiz, R.: Iterated Local Search. Springer International Publishing, Cham (2018) 579–605