CS273: Regular Languages Review
This document describes expectations of what knowledge and skills
you should have about regular languages - the first third of the course.
To prepare for the exam, you should make sure that you have done all relevant
reading assigned, gone through all homework problems and solutions, all
HBS problems, and viewed all lecture notes. We have attempted here to
give a reasonable high-level summary of concepts and skills, but the
"official" representation of what has been covered is the union of
of all material mentioned in the above sources, together with this document
and any additional material presented in lectures.
This document may be revised as the course staff looks it over to provide
our best effort at providing you with useful and complete information.
Any changes after 6pm Saturday 2/25 will be documented somehow.
- Preliminaries
- Induction
-
Know how to do inductive proofs, in particular, "structural" induction.
- Countable and Uncountable
- Understand the definition of countably and uncountably
infinite sets.
- Understand that one can prove a set countably infinite by
giving a method to list all elements (perhaps with repetition),
or by giving a bijection between the set and the natural numbers.
- Know which sets of numbers (natural numbers, rationals, reals, etc.)
are countably infinite, and which are not.
- Know how to prove a set uncountably infinite using Cantor's
diagonalization method, or by giving a bijection with a known
uncountably infinite set.
- Given the description of a set, prove whether it is finite,
countably infinite or uncountably infinite.
- Prove that there are uncomputable functions via a counting argument:
First arguing that the number of "algorithms" is only countably infinite.
Next arguing that the number of functions, say, from N to {0,1} is
uncountably infinite.
- Alphabets, Strings, Languages
- Definitions of alphabet, string or word, language (set)
- Be able to describe various languages of strings in set notation.
- Understand the effect of concatenation, *, +, homomorphism,
as operators on strings and as operators on languages.
- Be able to understand and to create recursive/inductive
definitions of sets of strings
- Be able to do proofs about strings and languages
by induction on the lengths of strings.
- DFAs
- Definition of DFA as directed labeled graph, or as 5-tuple.
- Understanding of transition function d (delta) and property that
d(q,xy) = d(d(q,x),y)) for all states q and strings x, y.
- Ability to design a DFA to meet a given specification.
- Understand the essence of DFAs: Finite memory. States capture
all and only that which needs to be remembered to allow the
computation to proceed. Intuition: counting is not possible.
Counting mod n, for some fixed constant n, is possible.
A DFA can keep track information that does not grow as the input
string grows.
- Ability to describe the language accepted by a given DFA
- Using "cross-product" and component-tuples of states
(of form (q1,q2)) to allow one DFA to simulate
two or more DFAs simultaneously. (As an example, proof that
regular languages are closed under intersection.)
- Proof of properties or behavior of given DFAs, by induction on length
of string w being processed, and "unrolling" transition function one
step to allow application of inductive hypothesis.
- "Transducers": DFAs with output, either given at state, or along
edges as transition is taken.
- NFAs
- Understand basic definition of NFA, both in terms of graphical model,
and as tuple.
- An NFA accepts by definition if there is at least one accepting
computation.
- Be able to take advantage of nondeterminism by designing a simple NFA
where a DFA would be much more complex. (In particular, where
a union of alternatives is desired.)
- Be able to design an NFA to match any of a collection of keywords.
- Understand use of epsilon edges and the convenience they offer.
- Be able to eliminate epsilon edges from an NFA, and know the
time complexity to do that.
- Be able to create a DFA that is equivalent to a given NFA
by using the subsets-of-states construction. Understand not only
how to do the construction, but why it works. Be able to describe
the time complexity of this process.
- Be able to give examples where creating an equivalent
DFA requires exponentially
many more states than a given NFA. Be able to give examples
where the number of states in this construction does not
increase.
- Be able to design NFAs to show various closure properties of
regular languages as needed. For example, closure under intersection
or union via the cross-product construction, or closure under reversal
via creation of an NFA from a DFA by reversing edges, creating a new
start state, ...
- Regular expressions
- Recognize valid and invalid regular expressions
- Be able to give the inductive definition of regular expressions
- Understand the precedence of the operators in a regular expression
(*, concatenation, +).
- Given a regular expression r, describe the language L(r) it denotes.
- Given the definition of a regular language L, construct a regular
expression r such that L(r) = L.
- Know basic algebraic identities on regular expressions, and be able to
use them to simplify regular expressions.
- Given a DFA or NFA M, be able to produce a regular expression r
such that L(r) = L(M), by using the state elimination method.
- Given a DFA or NFA M, be able to produce a regular expression r
such that L(r) = L(M) by simply reasoning about L(M), and constructing
r in an ad hoc manner.
- Given a regular expression r, be able to produce an NFA M such that
L(M) = L(r), inductively on the structure of r.
- Given a regular expression r, be able to produce an NFA M such that
L(M) = L(r) by simply reasoning about L(r), and constructing M in an
ad hoc manner.
- Be able to prove various properties and facts about regular expressions
via induction on the structure of the expression.
- Know the time complexity of transforming between alternative
representations (DFAs, NFAs, regular expressions).
- Pumping Lemma
- Be able to state the pumping lemma for regular languages
- Be able to prove the pumping lemma. This really involves
just understanding that if a string w is accepted by some M and
w is longer than the number of states, then because w must "loop"
back to some state in M en route to accept, we can eliminate or
repeat some segment of w to obtain new accepted words.
- Be able to use the pumping lemma to prove that a given language
is not regular.
- Make sure your argument is coherent, and has all the "glue"
so that it reads like a proof.
- Make sure that your argument does not assume some particular n,
or some particular x, y, and z. That is, know that you do not get
to choose n or x,y,z, but that you do get to choose w and i.
- Be familiar with the various types of pumping lemma arguments -
some based on matched characters, some based on lengths of
strings.
- Don't mistakenly think that the pumping lemma can be used to
prove that a language is regular.
- Closure properties of Regular Languages
- Be able to prove that regular languages are closed under
the following operations:
- Union
- Concatenation
- *
- Complement
- Intersection
- Homomorphism
- Substitution
- Reversal
- Miscellaneous other operations that you may have seen
- Be familiar with various techniques to show closure properties
of regular languages: By constructing DFAs or NFAs that depend on given
DFAs or NFAs, by constructing regular expressions, by arguing set-theoretically
(e.g., if L=L1-L2 is regular if
L1 and L2 are, because this is just
L1 intersected with the complement of L2, and regular
lanuages are closed under intersection and complement).
- Be able to use closure properties to prove that a language L is
NOT regular: By showing how if L were regular, then (applying closure
properties), we could obtain a known nonregular language L' from L (and perhaps
other regular languages). Since L' is known nonregular, L couldn't have been.
- DFA Minimization and Equivalence
- Be able to run the DFA minimization algorithm on a given DFA,
and obtain distinguishing strings for pairs of distinguishable states.
- Be able to apply the minimization algorithm to determine if two
DFAs accept the same language, by determining whether their start states
are distinguishable.
- Understand the key ideas in the minimization procedure:
- The definition of what it means for two states to be distinguishable.
- What a distinguishing string for two states is.
- If p and q are distinguishable states in DFA M, and strings
x and y lead to p and q respectively, be able to explain why any machine
accepting L(M) (even those that look nothing like M) must
send p and q to different states.
- Be able to argue that indistinguishability is an equivalence relation.
- Be able to argue that if all states of M are reachable, and
the minimization procedure has been applied to M, why there cannot
exist a DFA with fewer states.
- Be able to describe the time complexity of the DFA minimization
procedure.
- Recognize that minimization does not work for NFAs.
- Decision algorithms for Regular Languages
These items were covered in section on Thursday 2/23,
and will not be covered by the first exam.
- Be able to describe algorithms (and loosely, the time complexity)
for the following tasks:
- Given a DFA M, is L(M) empty?
- Given an NFA M, is L(M) empty?
- Given a regular expression r, is L(r) empty?
- Given a DFA M, is L(M) = Sigma*?
- Given two DFAs M1 and M2, determine whether
or not L(M1) = L(M2) using a method other
than minimization.
- Given a DFA or NFA M, determine whether or not L(M) is infinite.
- Miscellaneous other questions about behavior of DFAs and NFAs