How to remove duplicates by recursion in a Python list

It is quite common for Python lists to have duplicates. For some particular use cases, a user might need to remove them.

The illustration below shows the visual representation of how to remove duplicates from the list.

Visual representation of removing duplicates from the list

We can easily remove duplicates through recursion, whether it be a list of strings, integers, floating-point numbers, or any other data type. In this shot, we will remove duplicates through recursion.

Code

def remove_duplicates_recursion (dupList, temp):
if len(dupList) == 0: #condition 0 --> base case
return dupList
if dupList[0] in temp: #condition 1
temp.append(dupList[0])
return remove_duplicates_recursion(dupList[1:], temp)
else: #condition 2
temp.append(dupList[0])
return [dupList[0]] + remove_duplicates_recursion(dupList[1:], temp)
def main():
nums = [1,1,2,2]
strings = ['hello', 'world', 'hello', 'world', 'hi']
emptyList = []
print("number list with duplicates", nums)
print("number list without duplicates", remove_duplicates_recursion(nums, emptyList))
print("strings list with duplicates", strings)
print("string list without duplicates", remove_duplicates_recursion(strings, emptyList))
main()

Explanation

The Python file below includes a function named remove_duplicates_recursion which takes two arguments: dupList and temp. dupList is the list possibly containing duplicate elements, and temp is initially an empty list.

The function first checks if the length of dupList is 1 or 0. If so, it would trivially imply that there is no possibility of any duplicates occurring.

Then it checks whether the first element of dupList, i.e. dupList[0], is present in the array temp.

  • If the element is present then it will be appended at the end of temp and the function will be recursively called without the first element of dupList.
  • If the element is not present in temp, it will be added to the list that the function has to return.

As the function proceeds recursively, it will not allow any element in temp to be stored in the list that it has to return, and temp will keep filling up with successive elements in dupList.

Example

For a simulation exercise, take the example of the array dupList = [1, 1, 2, 2].

Iteration 1

  • dupList = [1, 1, 2, 2]

  • dupList[0] = 1

  • temp = []

Since the length of dupList is >= 1, and duplist[0] is not in temp, the block under condition # 2 will execute. It will add the first element of dupList in the list to return and call the remove_duplicates_recursion function again with dupList = [1, 2, 2] and temp = [1].

Iteration 2

  • dupList = [1, 2, 2]

  • temp = [1]

  • returning list = [1]

dupList[0] = 1

Now, as the value present in dupList[0] is also present in temp, condition # 1 will execute. The value in dupList[0] will not be added to the returning list, but the function will be called with all elements in dupList except for duplist[0]. This ensures that if a duplicate is found, it will not be added to the returning list, as the function checks for more duplicates further down dupList.

These same processes will repeat for iteration no. 3 and iteration no. 4 when the function gets an empty dupList as input. The returning list inside the stack will be stored in the nums variable in main.

This code is general for all data types.

Free Resources