e****o 发帖数: 76 | 1 假设一个Vector存了一些CLASS A 的objects。A 有一个member int b。
如果想把Vector 里所有A的objects that has b==5 remove 掉。 是用for loop还是
用 enumeration? 如果用enumeration, 怎样把object remove 掉, 不能用 iterator
。 |
g*****g 发帖数: 34805 | 2 you create a new Vector, iterate and store the reference of Class A
which have b==5. Then do a removeAll
iterator
【在 e****o 的大作中提到】 : 假设一个Vector存了一些CLASS A 的objects。A 有一个member int b。 : 如果想把Vector 里所有A的objects that has b==5 remove 掉。 是用for loop还是 : 用 enumeration? 如果用enumeration, 怎样把object remove 掉, 不能用 iterator : 。
|
e****o 发帖数: 76 | 3 removeAll from newly create vector wont affect the size of the old vector,
even though those (b==5) objects are null now.
【在 g*****g 的大作中提到】 : you create a new Vector, iterate and store the reference of Class A : which have b==5. Then do a removeAll : : iterator
|
A**o 发帖数: 1550 | 4 one way i was using before i knew i could use iterator
was do the reserve loop, ie. loop down. then it's ok to remove from list.
iterator
【在 e****o 的大作中提到】 : 假设一个Vector存了一些CLASS A 的objects。A 有一个member int b。 : 如果想把Vector 里所有A的objects that has b==5 remove 掉。 是用for loop还是 : 用 enumeration? 如果用enumeration, 怎样把object remove 掉, 不能用 iterator : 。
|
e****o 发帖数: 76 | 5 yes, reverse loop might be the correct answer. So enumeration can not work.
【在 A**o 的大作中提到】 : one way i was using before i knew i could use iterator : was do the reserve loop, ie. loop down. then it's ok to remove from list. : : iterator
|
A**o 发帖数: 1550 | 6 not much exp on enumeration. can't answer.
【在 e****o 的大作中提到】 : yes, reverse loop might be the correct answer. So enumeration can not work.
|
c*****t 发帖数: 1879 | 7 For vector, you can just do
for (int i = 0; i < vec.size ();)
{
if (((A)vec.get (i)).b == 5)
vec.remove (i);
else
++i;
}
However, it is not as efficient as getting the ones that do not match,
and then set it back.
iterator
【在 e****o 的大作中提到】 : 假设一个Vector存了一些CLASS A 的objects。A 有一个member int b。 : 如果想把Vector 里所有A的objects that has b==5 remove 掉。 是用for loop还是 : 用 enumeration? 如果用enumeration, 怎样把object remove 掉, 不能用 iterator : 。
|
e****o 发帖数: 76 | 8 One thing I worry about this solution is once you delete one from vec, it
size changes, you might not continue to use the size you got last time.
【在 c*****t 的大作中提到】 : For vector, you can just do : for (int i = 0; i < vec.size ();) : { : if (((A)vec.get (i)).b == 5) : vec.remove (i); : else : ++i; : } : However, it is not as efficient as getting the ones that do not match, : and then set it back.
|
c*****t 发帖数: 1879 | 9 You need to think again.
【在 e****o 的大作中提到】 : One thing I worry about this solution is once you delete one from vec, it : size changes, you might not continue to use the size you got last time.
|
m******t 发帖数: 2416 | 10
I don't understand. How does
theOriginal.removeAll(thoseBeGone);
not work?
【在 e****o 的大作中提到】 : removeAll from newly create vector wont affect the size of the old vector, : even though those (b==5) objects are null now.
|
|
|
g*****g 发帖数: 34805 | 11 想想可能还是把符合条件的放到一个temp vector里,
然后让original vector指向temp vector最快。
removeAll还是扫了两次。
【在 m******t 的大作中提到】 : : I don't understand. How does : theOriginal.removeAll(thoseBeGone); : not work?
|
m******t 发帖数: 2416 | 12
You mean those that need to be kept? Sure.
It's not always possible though, if theOriginal is not just a local
reference.
【在 g*****g 的大作中提到】 : 想想可能还是把符合条件的放到一个temp vector里, : 然后让original vector指向temp vector最快。 : removeAll还是扫了两次。
|
F****n 发帖数: 3271 | 13 你这个Work但每次都要调用一遍vec.size(),
而且remove会有不必要的shift
【在 c*****t 的大作中提到】 : For vector, you can just do : for (int i = 0; i < vec.size ();) : { : if (((A)vec.get (i)).b == 5) : vec.remove (i); : else : ++i; : } : However, it is not as efficient as getting the ones that do not match, : and then set it back.
|
F****n 发帖数: 3271 | 14 最好的方法还是LOOP DOWN
for (int i = vector.size() - 1; i >= 0; i--)
【在 m******t 的大作中提到】 : : You mean those that need to be kept? Sure. : It's not always possible though, if theOriginal is not just a local : reference.
|
m******t 发帖数: 2416 | 15
I agree.
【在 F****n 的大作中提到】 : 最好的方法还是LOOP DOWN : for (int i = vector.size() - 1; i >= 0; i--)
|
g*****g 发帖数: 34805 | 16 I don't, performance can be very bad.
Assume every object other than final one is a match.
Then you may resize the array a lot.
【在 m******t 的大作中提到】 : : I agree.
|
m******t 发帖数: 2416 | 17
No array copying is triggered when it is the last
element being deleted.
At least that's how the Sun implementation does it.
【在 g*****g 的大作中提到】 : I don't, performance can be very bad. : Assume every object other than final one is a match. : Then you may resize the array a lot.
|
g*****g 发帖数: 34805 | 18 I am not talking about the last, but last 2nd.
That's a lot of copying needed. For a long array,
there might be automatic resizing too.
Of course, most of time it doesn't matter.
【在 m******t 的大作中提到】 : : No array copying is triggered when it is the last : element being deleted. : At least that's how the Sun implementation does it.
|
m******t 发帖数: 2416 | 19
Right, I see what you mean now.
So out of n elements, let's say if a[0] through a[n-2]
all need to be deleted. It's always the "last 2nd".
So we are looking at n-1 arraycopy calls, but each call
only moves 1 element. So it's still O(n).
On the other hand, if _only_ a[0] needs to go, then
it's 1 arraycopy call moving n-1 elements. That's O(n) as well.
So there is always some kind of tradeoff here - the more
"goners" we have, the more arraycopy calls, but the fewer
elements actually copied each
【在 g*****g 的大作中提到】 : I am not talking about the last, but last 2nd. : That's a lot of copying needed. For a long array, : there might be automatic resizing too. : Of course, most of time it doesn't matter.
|
g*****g 发帖数: 34805 | 20 No, if the first half needs to be deleted, the 2nd half not.
it would be N/2 * N/2 => O(N^2)
【在 m******t 的大作中提到】 : : Right, I see what you mean now. : So out of n elements, let's say if a[0] through a[n-2] : all need to be deleted. It's always the "last 2nd". : So we are looking at n-1 arraycopy calls, but each call : only moves 1 element. So it's still O(n). : On the other hand, if _only_ a[0] needs to go, then : it's 1 arraycopy call moving n-1 elements. That's O(n) as well. : So there is always some kind of tradeoff here - the more : "goners" we have, the more arraycopy calls, but the fewer
|
|
|
F****n 发帖数: 3271 | 21 其实你们都对,如果是array不考虑内存肯定是另开一个好
既然是Vector那么没有特殊说明当然要保持原来的OBJECT。
象这种Array-based List Insert / Remove效率本来就低
如果要经常modify就不要用ArrayList, Vector, 用HashSet
或者commons.collections.*
【在 g*****g 的大作中提到】 : No, if the first half needs to be deleted, the 2nd half not. : it would be N/2 * N/2 => O(N^2)
|
c*****t 发帖数: 1879 | 22 其实,一次 remove 多个的话,array / vector 还是能达到 O (n) 。
就是将保留的先放到另外的 array 里,最后再弄回来。linkedlist
占的 memory 多,而且从一个 node 走到下一个也慢些。iterator
又会增加点 overhead (主要是 Java。C++ 都是 inline)。所以总
的来说,vector / array 并不慢。
至于自动 shrink,一般没有哪个 implementation 会这样。因为这
属于 predicting behavior。写该 container 的神奇的知道
pop_backup N 次以后不会又 push_back N 次?这纯粹属于没事找事。
array / vector 最大的问题主要是在 insertion 。
【在 F****n 的大作中提到】 : 其实你们都对,如果是array不考虑内存肯定是另开一个好 : 既然是Vector那么没有特殊说明当然要保持原来的OBJECT。 : 象这种Array-based List Insert / Remove效率本来就低 : 如果要经常modify就不要用ArrayList, Vector, 用HashSet : 或者commons.collections.*
|
F****n 发帖数: 3271 | 23 Insert / Remove还是慢,O(n)就已经是慢的了,
特别在需要经常INSERT / REMOVE的时候。Vector / array主要是iteration快
如果要经常修改,不要用VECTOR/ARRAY,如果不经常修改,经常ITERATE,用。
【在 c*****t 的大作中提到】 : 其实,一次 remove 多个的话,array / vector 还是能达到 O (n) 。 : 就是将保留的先放到另外的 array 里,最后再弄回来。linkedlist : 占的 memory 多,而且从一个 node 走到下一个也慢些。iterator : 又会增加点 overhead (主要是 Java。C++ 都是 inline)。所以总 : 的来说,vector / array 并不慢。 : 至于自动 shrink,一般没有哪个 implementation 会这样。因为这 : 属于 predicting behavior。写该 container 的神奇的知道 : pop_backup N 次以后不会又 push_back N 次?这纯粹属于没事找事。 : array / vector 最大的问题主要是在 insertion 。
|
c*****t 发帖数: 1879 | 24 LinkedList 寻找也需要 O (n)。而且通过 iterator 走到下一个 node 有
很多的 overhead 。教科书上的 O (1) deletion 是指你已经找到该 node
的情况下。相比之下 vector / array 里一个个的走不需要 iterator,
直接上 index,可以快很多。而且 vector / array 占的 memory 少,写
code 也不需要很多判断或者 call function。
所以综合来讲,如果 insert 较少,即使有需要多项 object 的 remove,
array / vector 还是比 linkedlist 强很多,不管是 performance 还是
easy to code 。
【在 F****n 的大作中提到】 : Insert / Remove还是慢,O(n)就已经是慢的了, : 特别在需要经常INSERT / REMOVE的时候。Vector / array主要是iteration快 : 如果要经常修改,不要用VECTOR/ARRAY,如果不经常修改,经常ITERATE,用。
|
F****n 发帖数: 3271 | 25 我从来没说用LinkedList, 不知道为什么把LinkedList拿出来说。
【在 c*****t 的大作中提到】 : LinkedList 寻找也需要 O (n)。而且通过 iterator 走到下一个 node 有 : 很多的 overhead 。教科书上的 O (1) deletion 是指你已经找到该 node : 的情况下。相比之下 vector / array 里一个个的走不需要 iterator, : 直接上 index,可以快很多。而且 vector / array 占的 memory 少,写 : code 也不需要很多判断或者 call function。 : 所以综合来讲,如果 insert 较少,即使有需要多项 object 的 remove, : array / vector 还是比 linkedlist 强很多,不管是 performance 还是 : easy to code 。
|
c*****t 发帖数: 1879 | 26 你说不用 vector / array 。那么剩下用什么?hehe
【在 F****n 的大作中提到】 : 我从来没说用LinkedList, 不知道为什么把LinkedList拿出来说。
|
F****n 发帖数: 3271 | 27 I think I mentioned using HashSet or commons collection.
Advantages of different collections:
HashSet: contains / remove / hot remove
ArrayList/Vector: iteration / index
LinkedList: hot remove (remove by iterator)
If you want to have both HashSet and ArrayList's strength, you have to
combine them. org.apache.commons.collections.* already implement those
collections.
【在 c*****t 的大作中提到】 : 你说不用 vector / array 。那么剩下用什么?hehe
|
c*****t 发帖数: 1879 | 28 There are many subtle issues with HashSet. Just because you are
using the pre-defined classes which "contained" the problem and
hide it from the eye, it doesn't mean they don't exist. Hash
collision for instance is a major problem, and the associated
item removal is another. They aren't better than simple array /
vector.
Database, for example, shown hash indexing isn't necessary better
than b+tree indexing:
http://archives.postgresql.org/pgsql-general/2005-05/msg00376.php
and
http://www.postg
【在 F****n 的大作中提到】 : I think I mentioned using HashSet or commons collection. : Advantages of different collections: : HashSet: contains / remove / hot remove : ArrayList/Vector: iteration / index : LinkedList: hot remove (remove by iterator) : If you want to have both HashSet and ArrayList's strength, you have to : combine them. org.apache.commons.collections.* already implement those : collections.
|
F****n 发帖数: 3271 | 29 It has nothing to do with this topic.
【在 c*****t 的大作中提到】 : There are many subtle issues with HashSet. Just because you are : using the pre-defined classes which "contained" the problem and : hide it from the eye, it doesn't mean they don't exist. Hash : collision for instance is a major problem, and the associated : item removal is another. They aren't better than simple array / : vector. : Database, for example, shown hash indexing isn't necessary better : than b+tree indexing: : http://archives.postgresql.org/pgsql-general/2005-05/msg00376.php : and
|
c*****t 发帖数: 1879 | 30 I was just saying that your understanding that (hashset is good for
hot removal) is wrong. You brought up hashset first, not me. Obviously,
if there are multiple objects fit the filter, it is either not a
set or there are not advantages using the hashset.
All I was trying to point out in the previous posts were that if
you have to do sequential search anyways to remove multiple objects
array / vector are a good choice due to very low overhead.
【在 F****n 的大作中提到】 : It has nothing to do with this topic.
|
|
|
F****n 发帖数: 3271 | 31 如果有collision Hash表当然会慢,但不能因此就说他慢于ARRAY,
算法分析的基本原则就是不能拿A的最坏情况跟B的最好情况比
【在 c*****t 的大作中提到】 : I was just saying that your understanding that (hashset is good for : hot removal) is wrong. You brought up hashset first, not me. Obviously, : if there are multiple objects fit the filter, it is either not a : set or there are not advantages using the hashset. : All I was trying to point out in the previous posts were that if : you have to do sequential search anyways to remove multiple objects : array / vector are a good choice due to very low overhead.
|
m******t 发帖数: 2416 | 32
Nah, bug was right as far as the point at discussion
- which is the "count down with in-place shift" approach.
I forgot that the performance wasn't necessarily monotonic.
I still do prefer this approach over having another
toBeDeleted collection though, because it's simple and
easy to read, and not really significantly slower unless
N is huge.
【在 F****n 的大作中提到】 : 其实你们都对,如果是array不考虑内存肯定是另开一个好 : 既然是Vector那么没有特殊说明当然要保持原来的OBJECT。 : 象这种Array-based List Insert / Remove效率本来就低 : 如果要经常modify就不要用ArrayList, Vector, 用HashSet : 或者commons.collections.*
|
F****n 发帖数: 3271 | 33 其实用过C或者经常用数组的很容易倾向于好虫的方法
想想如果是数组,肯定要新开一个,删掉位移估计只内存不够时才会用
而VECTOR其实就是一个封装的数组。
【在 m******t 的大作中提到】 : : Nah, bug was right as far as the point at discussion : - which is the "count down with in-place shift" approach. : I forgot that the performance wasn't necessarily monotonic. : I still do prefer this approach over having another : toBeDeleted collection though, because it's simple and : easy to read, and not really significantly slower unless : N is huge.
|
m******t 发帖数: 2416 | 34
That isn't immediately obvious to me though. The filter
condition may be completely orthogonal to these objects'
equals/hashCode impl.
【在 c*****t 的大作中提到】 : I was just saying that your understanding that (hashset is good for : hot removal) is wrong. You brought up hashset first, not me. Obviously, : if there are multiple objects fit the filter, it is either not a : set or there are not advantages using the hashset. : All I was trying to point out in the previous posts were that if : you have to do sequential search anyways to remove multiple objects : array / vector are a good choice due to very low overhead.
|
c*****t 发帖数: 1879 | 35 I listed two possibilities in that sentence, hehe :)
1. same as the hashcode, in that case, there aren't many objects but one,
which isn't what LZ wanted.
2. different from hashcode. Then one has to do sequential scan since
the container itself doesn't provide the quick access. Thus there
are no advantages using the HashSet all. Might as well just use
array/vector.
amount of finesse."
【在 m******t 的大作中提到】 : : That isn't immediately obvious to me though. The filter : condition may be completely orthogonal to these objects' : equals/hashCode impl.
|
m******t 发帖数: 2416 | 36
Right, that's the worse scenario out of the two.
I thought the idea was to use a set to temporarily hold
the items to be removed. So when we call removeAll on the
original vector, each lookup in the set would cost constant
time ideally.
【在 c*****t 的大作中提到】 : I listed two possibilities in that sentence, hehe :) : 1. same as the hashcode, in that case, there aren't many objects but one, : which isn't what LZ wanted. : 2. different from hashcode. Then one has to do sequential scan since : the container itself doesn't provide the quick access. Thus there : are no advantages using the HashSet all. Might as well just use : array/vector. : : amount of finesse."
|
c*****t 发帖数: 1879 | 37
That's not a good approach, because of repeated calling of hashCode () and
lookup of the object in the
hashset which is slow. A better approach is:
final int size = vec.size ();
Vector newVec = new Vector (size);
for (int i = 0; i < size; ++i)
{
A a = (A)vec.get (i); // very fast op due to direct array retrieval
if (a.b != 5) // or whatever filter that indicates keeping the element
newVec.add (a); // very fast op since we have all the capacity set
}
vec.clear (); // very fast op,
【在 m******t 的大作中提到】 : : Right, that's the worse scenario out of the two. : I thought the idea was to use a set to temporarily hold : the items to be removed. So when we call removeAll on the : original vector, each lookup in the set would cost constant : time ideally.
|
F****n 发帖数: 3271 | 38 coconut真有意思,研究了半天终于发现了我前面说的。
【在 c*****t 的大作中提到】 : : That's not a good approach, because of repeated calling of hashCode () and : lookup of the object in the : hashset which is slow. A better approach is: : final int size = vec.size (); : Vector newVec = new Vector (size); : for (int i = 0; i < size; ++i) : { : A a = (A)vec.get (i); // very fast op due to direct array retrieval : if (a.b != 5) // or whatever filter that indicates keeping the element
|
c*****t 发帖数: 1879 | 39 What do you mean? This is just a more detailed explanation which I posted
in 10412.
will
【在 F****n 的大作中提到】 : coconut真有意思,研究了半天终于发现了我前面说的。
|