When developed using Swift input pocketed,I met a very interesting question -Deduplication。
All to known,Candidate input method in the calculation will always be out there may be a duplicate options (such as code table and have a word in the lexicon,They may be different coding,But the word is the same and the like),This time you need to re,But while keeping the same order of candidates。
Others solutions
If you go online to find,Then you may find is this:
1 2 3 4 5 |
extension Array where Element : Hashable { var unique: [Element] { return Array(Set(self)) } } |
source:https://stackoverflow.com/questions/27624331/unique-values-of-array-in-swift
Such:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
extension Array where Element: Hashable { func removingDuplicates() -> [Element] { var addedDict = [Element: Bool]() return filter { addedDict.updateValue(true, forKey: $0) == nil } } mutating func removeDuplicates() { self = self.removingDuplicates() } } |
source:https://www.hackingwithswift.com/example-code/language/how-to-remove-duplicate-items-from-an-array
Or is this:
1 2 3 4 5 6 7 8 9 10 11 |
extension Array where Element: Equatable { mutating func removeDuplicates() { var result = [Element]() for value in self { if !result.contains(value) { result.append(value) } } self = result } } |
source:https://medium.com/if-let-swift-programming/swift-array-removing-duplicate-elements-128a9d0ab8be
Problem
Let's start with the first code block above,He spent directly Set ,All to known,This thing and dictionaries ( Dictionary ) A kind,in realityNo orderof,So you put Array Converted to Set And then converted back to Array To really achieve the purpose of weight,But it is likely to make the original order of disorder。
In the Swift, Set existTo some extentIt is able to maintain the order,For example, the example above is actually the source page is to keep the original order,This can only be acoincidence,Because it doesdo not guaranteeNothing is original sequence。
Achieve the second block of code actually has been very good,In fact, it is a way to realize I was writing this article when serendipity,Use is Array of filter And a method of use Dictionary Filtering,It takes a little bit of extra space,But nothing wrong。
The third problem is that the code blocks directly Array of contains Method to determine the presence of,This is a very bad way,In fact, IA quick look at the stringThe article had discussed,Almost all Swift contains They are not the same type of index or range To the quick,Moreover, Array The search speed is O(n)。
More high-end solutions
As a matter of fact,Do you have more advanced solutions,Such as the legendary "Ordered dictionary”,However, it is regrettable that the algorithm Swift basis of the framework does not give such high-end,For example, aRed-black treeachieve。but,I found a fewRed-black treeAfter comparing,In fact this advanced algorithms more slowly (perhaps because of my specific scenario can not reflect the advantages of advanced algorithms,For example, the number is not enough need to go heavy? )
Ultimately
In short,I finally realized my experience my own version,In fact earlier in the second block of code (that dictionary implementation) has been very good,but,Here is my code posted,And discuss them:
1 2 3 4 5 6 7 8 9 |
extension Array where Element:Hashable { var unique:[Element] { var uniq = Set<Element>() uniq.reserveCapacity(self.count) return self.filter { return uniq.insert($0).inserted } } } |
My implementation,used Set As the filter,Because it is mainly doing that :)
It was also said that Set actually achieve red-black tree,Here I want to make a question mark,Swift Set in the specific algorithm is what I have not found,But it does not seem to red-black tree。
There are two important places,first of all .reserveCapacity ,In fact, for Swift is, Array 、 Dictionary 、 Set This collection space is a type of dynamic allocation,Most of the time you do not need to manually set the size of its capacity,But here we speed pursuit,So I set it manually Array length,Because the aim here is to re,So the absolute maximum lengthwill notExceeds the size of the original data,This avoids Set Add an element to the process leading to heavy expansion,The expansion process,There are actually a lot of time cost。
Another point is Set of .insert ,Usually we will ignore .insert The return value,After all, it is to include a statement @discardableResult of,But if you look closely,You will find this method returns a tuple,Type is (inserted: Bool, memberAfterInsert: Set<Element>.Element) ,It tells you,Whether this element inserted successfully,And this element itself - that is,If there is already,Then insert failed。
Such,We get a quick、Streamlined Swift unique Technique。
discuss
As a matter of fact,At first glance I realize this seems to have no great place,after all,That is, a set of Bale,Then,If that is the direct implementation,Then most people may think of similar,For example, this:
1 2 3 4 5 6 7 8 9 10 11 12 |
var unique:[Element] { var uniq = Set<Element>() var result:[Element] = [] for v in self { guard uniq.firstIndex(of: v) == nil else {continue} uniq.insert(v) result.append(v) } return result } |
This tedious version be implemented on top,But it is very intuitive,Here I use firstIndex(of: To determine the presence of,Than contains A little bit faster,How specific effect? Let's use this code to check:
1 2 3 4 5 6 7 |
let data = [1,2,3,4,5,6,7,8,9,0,9,8,7,6,5,4,3,2,1,4,2,3,5,6,7,45,3,43,99,687,8,5,3] var date = Date() for _ in 0..<9999 { let a = data.unique } print(Date().timeIntervalSince(date)*1000) |
We went for a re-write of an array of handy 9999 Secondary,calculating time,The results obtained are units of milliseconds (ms): 167.54305362701416
It looks pretty fast,right? Then we use the same test code,Try to achieve the above-mentioned dictionary by weight,I put the code paste over:
1 2 3 4 5 6 |
var unique:[Element] { var addedDict = [Element: Bool]() return filter { addedDict.updateValue(true, forKey: $0) == nil } } |
The same deduplication 9999 Secondary,Get time: 142.61806011199951 It looks good,Soon 25 millisecond!
At last,I realized:
1 2 3 4 5 6 7 |
var unique:[Element] { var uniq = Set<Element>() uniq.reserveCapacity(self.count) return self.filter { return uniq.insert($0).inserted } } |
result: 103.64902019500732 ms
of course,This discussion is a prerequisite limited,For example, the amount of data will not be great。There is the main programming language for Swift,Change to other languages,Perhaps it will be more convenient and more efficient algorithm perhaps (mostly Swift in the pit too ......),Another,For milliseconds before a mistake,It has now been corrected,Super awkward。
Original article written by LogStudio:R0uter's Blog » Swift in the array to re-program
Reproduced Please keep the source and description link:https://www.logcg.com/archives/3177.html