N Ways to Remove Duplicates from a Python List (Part 2)

## 10. Deduplicate with a dict combined with filter

    import time  # used by the timing code in every snippet of this article

    def unique(item):
        # keep an item only the first time it is seen; obj records seen items
        if obj.get(item) is None:
            obj[item] = item
            return True
        return False

    # test
    data = ['a', 'a', 1, 1, 2, 2, 'b', 'b', 2, 1]
    start_time = time.time()
    obj = {}
    # filter() is lazy in Python 3, so list() is needed to run it and show the result
    print("filter + dict + get:", list(filter(unique, data)))
    print("time:" + str((time.time() - start_time) * 1000) + " ms")

## 11. Deduplicate with map

Like filter, map is a higher-order function that processes the items one by one. Unlike filter, map keeps every item and never drops any, so a duplicate is mapped to None and the None values are filtered out afterwards.

    def unique(item):
        # return the item on its first appearance, None on every later one
        if item not in newList:
            newList.append(item)
            return item
        return None

    # test
    data = ['a', 'a', 1, 1, 2, 2, 'b', 'b', 2, 1]
    newList = []
    start_time = time.time()
    print("list from Map:", list(filter(lambda item: item is not None, map(unique, data))))
    print("time:" + str((time.time() - start_time) * 1000) + " ms")

## 12. Deduplicate through the uniqueness of set elements

A set stores each value only once. Note that it does not preserve the original order of the list.

    data = ['a', 'a', 1, 1, 2, 2, 'b', 'b', 2, 1]
    start_time = time.time()
    print("from Set:", set(data))
    print("time:" + str((time.time() - start_time) * 1000) + " ms")

## 13. Sort first, walk from back to front, compare each item with the previous one, and remove the current item if they are equal

    def unique(data):
        # sort by (type name, value) so that int and str are never compared directly
        data.sort(key=lambda x: (type(x).__name__, x))
        l = len(data)
        # stop at index 1: index 0 has no previous item to compare with
        while l > 1:
            l -= 1
            if data[l] == data[l - 1]:
                del data[l]
        return data

    # test
    data = ['a', 'a', 1, 1, 2, 2, 'b', 'b', 2, 1]
    start_time = time.time()
    print("sort + remove:", unique(data))
    print("time:" + str((time.time() - start_time) * 1000) + " ms")

## 14. Sort first, walk from front to back, compare each item with the next one, and remove the current item if they are equal

    def unique(data):
        """
        In Python 3, sorting a list that mixes int and str raises
        TypeError: '<' not supported between instances of 'int' and 'str'.
        Sorting by (type name, value) groups members of the same type
        and avoids that comparison.
        """
        data.sort(key=lambda x: (type(x).__name__, x))
        l = len(data) - 1
        i = 0
        while i < l:
            if data[i] == data[i + 1]:
                del data[i]
                # stay on the same index and shrink the upper bound
                i -= 1
                l -= 1
            i += 1
        return data

    # test
    data = ['a', 'a', 1, 1, 2, 2, 'b', 'b', 2, 1]
    start_time = time.time()
    print("sort + del ASC:", unique(data))
    print("time:" + str((time.time() - start_time) * 1000) + " ms")

## 15. Deduplicate with reduce

reduce accumulates a result: each item is appended to the accumulated list only if it is not already in it.

    import functools

    def unique(data):
        def foo(result, item):
            return result if item in result else result + [item]
        # start from an empty list so that empty input and list-valued items also work
        return functools.reduce(foo, data, [])

    # test
    data = ['a', 'a', 1, 1, 2, 2, 'b', 'b', 2, 1]
    start_time = time.time()
    print("functools.reduce:", unique(data))
    print("time:" + str((time.time() - start_time) * 1000) + " ms")

## 16. Deduplicate with recursion

The recursion works from the back toward the front, one item per call, and stops when the length reaches 1. If the last item equals any earlier item, it is a duplicate and is deleted. In effect, the self-call replaces the outer loop.

    def recursionUnique(data, length):
        # the parameter is named length so that it does not shadow the built-in len
        if length <= 1:
            return data
        l = length
        last = l - 1
        isRepeat = False
        # compare the last item of the current prefix with every earlier item
        while l > 1:
            l -= 1
            if data[last] == data[l - 1]:
                isRepeat = True
                break
        if isRepeat:
            del data[last]
        return recursionUnique(data, length - 1)

    # test
    data = ['a', 'a', 1, 1, 2, 2, 'b', 'b', 2, 1]
    start_time = time.time()
    print("recursionUnique:", recursionUnique(data, len(data)))
    print("time:" + str((time.time() - start_time) * 1000) + " ms")

## 17. Another way to deduplicate with recursion

The recursion again works from the back toward the front and stops when the length reaches 1. Unlike the previous version, it concatenates the non-duplicate items into the result instead of deleting from the list.

    def recursionUniqueNew(data, length):
        if length <= 1:
            # return only the prefix under consideration, not the whole list
            return data[:length]
        l = length
        last = l - 1
        isRepeat = False
        while l > 1:
            l -= 1
            if data[last] == data[l - 1]:
                isRepeat = True
                break
        # a duplicate contributes nothing; a first occurrence is appended
        result = [] if isRepeat else [data[last]]
        return recursionUniqueNew(data, length - 1) + result

    # test
    data = ['a', 'a', 1, 1, 2, 2, 'b', 'b', 2, 1]
    start_time = time.time()
    print("recursionUniqueNew:", recursionUniqueNew(data, len(data)))
    print("time:" + str((time.time() - start_time) * 1000) + " ms")
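Methods 13 and 14 both sort the list and then drop adjacent equal items. As a minimal sketch of the same idea using only the standard library, `itertools.groupby` can collapse each run of equal neighbours into one item. The helper name `unique_sorted` and the `(type name, value)` sort key are assumptions made for this illustration, not part of the original article.

```python
import itertools

def unique_sorted(data):
    """Return the distinct items of data, ordered by (type name, value)."""
    # Sorting by type name first means int and str are never compared directly,
    # which would raise TypeError in Python 3.
    ordered = sorted(data, key=lambda x: (type(x).__name__, x))
    # groupby yields one (key, group) pair per run of equal neighbours.
    return [key for key, _group in itertools.groupby(ordered)]

data = ['a', 'a', 1, 1, 2, 2, 'b', 'b', 2, 1]
print("sort + groupby:", unique_sorted(data))  # [1, 2, 'a', 'b']
```

Because it uses `sorted` rather than `list.sort`, this version leaves the input list unchanged, at the cost of one extra copy.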

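## Discussion

As the examples above show, Python is quite flexible compared with many other languages. Together with JavaScript it is one of the most popular scripting languages, and that flexibility is probably one reason for its popularity.

Which approach fits best? Which one do you usually use to remove duplicates: building a new list, modifying the list in place, relying on a structure such as dict or set, or something else?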
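One way to approach the question of which method fits best is to measure. The sketch below compares three of the strategies mentioned (a new list, a dict, a set) with `timeit`. The function names and the enlarged test list are assumptions for illustration, and the dict and set variants only work when every item is hashable.

```python
import timeit

# The article's test data, repeated to make the timing differences visible.
data = ['a', 'a', 1, 1, 2, 2, 'b', 'b', 2, 1] * 100

def with_new_list(items):
    """Build a new list, keeping the first occurrence of each item."""
    result = []
    for item in items:
        if item not in result:  # linear scan of the result for every item
            result.append(item)
    return result

def with_dict(items):
    """Use dict keys for uniqueness; insertion order is kept in Python 3.7+."""
    return list(dict.fromkeys(items))

def with_set(items):
    """Use a set; the order of the result is not guaranteed."""
    return list(set(items))

for func in (with_new_list, with_dict, with_set):
    seconds = timeit.timeit(lambda: func(data), number=1000)
    print(f"{func.__name__}: {seconds * 1000:.2f} ms for 1000 runs")
```

The timings depend on the machine and on the data. As a rough guide, the dict variant is a reasonable default when the original order matters, and the set variant when it does not.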