What are regular languages?

A language is a regular language if and only if a finite state machine recognizes it. To understand this concept better, we need to understand what a finite state machine is.

Finite state machine

A finite state machine, also known as finite automata, is the simplest form of computation and has limited memory. Let’s have a look at some of the prerequisites of the finite state machine:

  • Symbol: Any character or element can be considered a symbol, for example, a,b,c,0,1,2,3,...a, b, c, 0, 1, 2, 3, ....

  • Alphabet: An alphabet is denoted by ΣΣ (sigma) and is a collection of symbols, for example, {a,b}\{a,b\}, {d,e,f,g}\{d,e,f,g\}, {0,1,2}\{0,1,2\}and so on.

  • String: A sequence of symbols is known as a string, for example, aaaa, bbbb, abaaba, a00ba00b, and so on.

  • Language: A set of strings is known as a language. Let’s look at the following examples:

Let us say that we have the alphabet Σ={0,1}Σ = \{0,1\}.

We have language L1L1 , where L1=set of all strings with length 2L1 = set~of~all~strings~with~length~2.

The possible strings for language L1L1over the alphabet ΣΣ are as follows:

={00,01,10,11}=\{00, 01, 10, 11\}.

We have language L2L2 , where L2=set of all strings with length 3L2 = set~of~all~strings~with~length~3.

The possible strings for language L2L2over the alphabet ΣΣ are as follows:

={000,001,010,011,100,101,110,111}=\{000, 001, 010, 011, 100, 101, 110, 111\}.

We have language L3L3 , where L3=set of all strings starting with 0L3 = set~of~all~strings~starting~with~0.

The possible strings for language L3L3 over the alphabet ΣΣ are as follows:

={0,00,01,000,001,010,011,0000,...}=\{0, 00, 01, 000, 001, 010, 011, 0000, ...\}.

The languages L1L1 and L2L2 are finite sets because they have a finite number of elements, whereas the language L3L3 is an infinite set.

Powers of Σ

Let’s say that we have the alphabet Σ={0,1}Σ = \{0,1\}:

Σ0=Σ ^0=Set of all strings with length 00: Σ0={ε}Σ ^0=\{ε\} (cardinality = 11)

Σ1=Σ ^1=Set of all strings with length 11: Σ1={0,1}Σ ^1=\{0, 1\} (cardinality = 22)

Σ2=Σ ^2=Set of all strings with length 22: Σ2={00,01,10,11}Σ ^2=\{00, 01, 10, 11\} (cardinality = 44)

Σ3=Σ ^3=Set of all strings with length 33: Σ3={000,001,010,011,100,101,110,111}Σ ^3= \{000, 001, 010, 011, 100, 101, 110, 111\} (cardinality = 88)

......

Σn=Σ ^n=Set of all strings with length nn (cardinality = 2n2^nfor elements more than 2)

A special case where the power of ΣΣ is denoted by a ^*, which is ΣΣ ^*, the star can be replaced by:

Σ=Σ0Σ1Σ2Σ3...Σ ^*=Σ ^0 ∪ Σ ^1 ∪ Σ ^2 ∪ Σ ^3 ...

      ={ε}{0,1}{00,01,10,11}{000,001,010,011,100,101,110,111}...~~~~~~= \{ε\} ∪ \{0,1\} ∪ \{00,01,10,11\} ∪ \{000, 001, 010, 011, 100, 101, 110, 111\} ...

      =Set of all possible strings of all lengths over the alphabet Σ={0,1}~~~~~~= Set~of~all~possible~strings~of~all~lengths~over~the~alphabet~Σ = \{0,1\}.

Now, let’s look at what finite state machines are.

Hierarchical display of finite state machines
Hierarchical display of finite state machines
  • States: A finite state machine has a finite set of states represented by the notation Q={q1,q2,...,qn}Q = \{q1, q2,..., qn\}. Each state represents a particular setting or arrangement of the system under modeling. There can only be one state of the machine at once.

The machine can be defined by using the following five tuples:

Q=Set of all statesQ = Set~of~all~states

Σ =InputsΣ  = Inputs

q0=Start stateq_0 = Start~state

F=Set of final statesF = Set~of~final~states

δ=Transition function from Q×ΣQδ = Transition~function~from~Q × Σ → Q

  • Transitions: Based on input events or circumstances, transitions—represented by arrows or edges—define the permitted state transitions. Transitions show how the machine changes states in response to particular inputs.

  • Input alphabet: A finite set of symbols or events that can cause state transitions serves as the input alphabet for the machine. The machine reads one input symbol at a time, which then responds by switching between states.

  • Transition function: Using the transition function, δ:Q×ΣQδ: Q × Σ → Q, we can determine how the machine reacts to inputs. The following state to which the machine should transition is specified for the current state, qeqe, and input symbol,σσ, by δ(q,σ)δ(qₑ, σ). The machines’s behavior is encapsulated in this function. The output is the next state to which the machine will transition.

  • Start state: The machine starts at one state, known as the start state (q0)(q0). It symbolizes the system’s starting point.

  • Accept states: Some state(s) may be designated as accept state(s) or final state(s). These states denote the completion of a particular pattern. Accept states are frequently used by finite state machines that recognize particular languages or patterns, such as regular expressions.

  • Output (optional): Based on the output behavior, finite state machines can be divided into two categories:

    • Moore machine: Only the current state can determine the output. Each state is connected to a predetermined output.

    • Mealy machine: The input as well as the output are dependent on one another. There may be associated output values for transitions.

Let’s take an example of a finite state machine:

Finite state machine A
Finite state machine A

Q={A,B,C,D}Q = \{A, B, C, D\}

Σ ={0,1}Σ  = \{0,1\}

q0=Aq_0 = A

F={D}F = \{D\}

δ=Transition function from Q×ΣQδ = Transition~function~from~Q × Σ → Q

Transition table for finite state machine A
Transition table for finite state machine A

Regular languages

As stated earlier, a language is said to be a regular language if and only if a finite state machine recognizes it.

So, what languages are not regular?

  • A language which is not recognized by a finite state machine

  • A language that requires memory

Why is that? This is because the memory of a finite-state machine is very limited. It only stores the current state and it cannot store or count strings. This makes such a language "not regular."

Let’s have a look at some examples:

  • Language: ababbababbababbababb following the rule that the first five letters 'ababbababb' should be repeated. The machine will have to keep track of the five letters that need to be repeated and store them, so this requires extra memory

  • Language: aNbNa^Nb^N following the rule that the number of bs must be equal to the number of as. Let’s see if a finite-state machine can be designed for this example. The machine will have to keep track of the count of as and then will need to keep track of the count of bs to see if they are equal, so this requires extra memory.

Therefore, by counter-example, we now know what regular languages are.

Quiz yourself

Test your knowledge of the regular languages.

1

What is a finite state machine (FSM)?

A)

A machine with unlimited memory.

B)

A machine that can recognize any language.

C)

A simple computational model with limited memory.

D)

A machine used only for graphics rendering.

Question 1 of 50 attempted

Free Resources

Copyright ©2025 Educative, Inc. All rights reserved