It is quite common for Python lists to have duplicates. For some particular use cases, a user might need to remove them.
The illustration below shows the visual representation of how to remove duplicates from the list.
We can easily remove duplicates through recursion, whether it be a list of strings, integers, floating-point numbers, or any other data type. In this shot, we will remove duplicates through recursion.
def remove_duplicates_recursion (dupList, temp):if len(dupList) == 0: #condition 0 --> base casereturn dupListif dupList[0] in temp: #condition 1temp.append(dupList[0])return remove_duplicates_recursion(dupList[1:], temp)else: #condition 2temp.append(dupList[0])return [dupList[0]] + remove_duplicates_recursion(dupList[1:], temp)def main():nums = [1,1,2,2]strings = ['hello', 'world', 'hello', 'world', 'hi']emptyList = []print("number list with duplicates", nums)print("number list without duplicates", remove_duplicates_recursion(nums, emptyList))print("strings list with duplicates", strings)print("string list without duplicates", remove_duplicates_recursion(strings, emptyList))main()
The Python file below includes a function named remove_duplicates_recursion
which takes two arguments: dupList
and temp
. dupList
is the list possibly containing duplicate elements, and temp
is initially an empty list.
The function first checks if the length of dupList
is 1
or 0
. If so, it would trivially imply that there is no possibility of any duplicates occurring.
Then it checks whether the first element of dupList
, i.e. dupList[0]
, is present in the array temp
.
temp
and the function will be recursively called without the first element of dupList
.temp
, it will be added to the list that the function has to return.As the function proceeds recursively, it will not allow any element in temp
to be stored in the list that it has to return, and temp
will keep filling up with successive elements in dupList
.
For a simulation exercise, take the example of the array dupList = [1, 1, 2, 2]
.
dupList = [1, 1, 2, 2]
dupList[0] = 1
temp = []
Since the length of dupList
is >= 1, and duplist[0]
is not in temp
, the block under condition # 2
will execute. It will add the first element of dupList
in the list to return and call the remove_duplicates_recursion
function again with dupList = [1, 2, 2]
and temp = [1]
.
dupList = [1, 2, 2]
temp = [1]
returning list = [1]
dupList[0] = 1
Now, as the value present in dupList[0]
is also present in temp
, condition # 1
will execute. The value in dupList[0]
will not be added to the returning list, but the function will be called with all elements in dupList
except for duplist[0]
. This ensures that if a duplicate is found, it will not be added to the returning list, as the function checks for more duplicates further down dupList
.
These same processes will repeat for iteration no. 3 and iteration no. 4 when the function gets an empty dupList
as input. The returning list inside the stack will be stored in the nums
variable in main.
This code is general for all data types.