m****r 发帖数: 141 | 1 This is an interview question. The interview has been done.
Given a sequence of data (it may have duplicates), a fixed-sized moving
window, move the window at each iteration from the start of the data
sequence, such that
(1) the oldest data element is removed from the window and a new data
element is pushed into the window
(2) find the median of the data inside the window at each moving.
My idea:
Use 2 heaps to hold median. In side the window, sort the data in the window
in the first iteration, the min heap holds the larger part and the max heap
holds the smaller part. If the window has odd number of data, the max heap
returns the median otherwise the arithmetic mean of the top elements of the
two heaps is the median.
When a new data is pushed in to the window, remove the oldest data from one
of the heap and compare the new data with the top of max and min heap so
that to decide which heap the data to be put. Then, find the median just
like in the first iteration.
But, how to find a data element in a heap is a problem. Heap is a binary
tree not a binary search tree.
Any help is really appreciated.
Thanks | h**********l 发帖数: 6342 | 2 除了第一次,以后应该都是 O(1)
第一次sort一下window里面n个数记下median
然后删一个数,跟median比较,你就知道左边还是右边,或者median被删,也是一样的
新加一个数,也知道median左边右边
如果在median两边, median不变,
不然就是median旁边一个是新的median,左边右边判断一下
window是偶数情况应该类似的
window
heap
【在 m****r 的大作中提到】 : This is an interview question. The interview has been done. : Given a sequence of data (it may have duplicates), a fixed-sized moving : window, move the window at each iteration from the start of the data : sequence, such that : (1) the oldest data element is removed from the window and a new data : element is pushed into the window : (2) find the median of the data inside the window at each moving. : My idea: : Use 2 heaps to hold median. In side the window, sort the data in the window : in the first iteration, the min heap holds the larger part and the max heap
| l***i 发帖数: 1309 | | s******n 发帖数: 3946 | 4 新的数插哪里你没说
【在 h**********l 的大作中提到】 : 除了第一次,以后应该都是 O(1) : 第一次sort一下window里面n个数记下median : 然后删一个数,跟median比较,你就知道左边还是右边,或者median被删,也是一样的 : 新加一个数,也知道median左边右边 : 如果在median两边, median不变, : 不然就是median旁边一个是新的median,左边右边判断一下 : window是偶数情况应该类似的 : : window : heap
| s******n 发帖数: 3946 | 5 window分成1个maxheap紧接1个minheap,保持2个heap的大小最多差1,maxheap的最大
值小于等于minheap的最小值。复杂度每插入一个数是log(L),L是window的大小。 | h**********l 发帖数: 6342 | 6 应该继续放在sorted的 window里面, window 大小的log复杂度
【在 s******n 的大作中提到】 : 新的数插哪里你没说
| c****9 发帖数: 164 | 7 如果每次都是删除oldest element的话需要一个额外的array来存window大小的按照old
顺序排序的吧,其他同happy的解法 | m****r 发帖数: 141 | 8 thanks,
How to find the oldest element in the heaps ?
【在 s******n 的大作中提到】 : window分成1个maxheap紧接1个minheap,保持2个heap的大小最多差1,maxheap的最大 : 值小于等于minheap的最小值。复杂度每插入一个数是log(L),L是window的大小。
| s******n 发帖数: 3946 | 9 没看到这个要求。。。如果有这个要求就复杂了,得搞double-link-list,heap调整的
时候也要调整link-list
【在 m****r 的大作中提到】 : thanks, : How to find the oldest element in the heaps ?
| m****i 发帖数: 650 | 10 要往sort windows里插入数字,看你的sorted window是什么结构了。 假设window
size是L。如果 window是array,那么找到要删除的数字是lg(L),找到要插入的位子lg(
L), 移动2个数字之间的数字要L. 也就是每次都要O(L)的时间复杂度。 如果我windows
是linklist也是O(L), 所以要想log(L)就要用BST或skip list. 不知道你说的O(1)如
何得到的。
【在 h**********l 的大作中提到】 : 应该继续放在sorted的 window里面, window 大小的log复杂度
| w****o 发帖数: 2260 | 11 好难呀
【在 m****r 的大作中提到】 : This is an interview question. The interview has been done. : Given a sequence of data (it may have duplicates), a fixed-sized moving : window, move the window at each iteration from the start of the data : sequence, such that : (1) the oldest data element is removed from the window and a new data : element is pushed into the window : (2) find the median of the data inside the window at each moving. : My idea: : Use 2 heaps to hold median. In side the window, sort the data in the window : in the first iteration, the min heap holds the larger part and the max heap
| m****r 发帖数: 141 | 12 So, in your way, the total complexity is
O(n * m), where n is number of data, m is the window size.
Space complexity: O(m), you need to copy the data in the window to an array.
Right ?
lg(
windows
【在 m****i 的大作中提到】 : 要往sort windows里插入数字,看你的sorted window是什么结构了。 假设window : size是L。如果 window是array,那么找到要删除的数字是lg(L),找到要插入的位子lg( : L), 移动2个数字之间的数字要L. 也就是每次都要O(L)的时间复杂度。 如果我windows : 是linklist也是O(L), 所以要想log(L)就要用BST或skip list. 不知道你说的O(1)如 : 何得到的。
| c**********e 发帖数: 2007 | 13 My solution: Suppose the window width is L.
Use two heaps, one min-heap and one max-heap, and an array Node* arr[L],
which are pointers to the nodes.
Each node of the heap is a structure. It not only includes the data, a left
pointer, and a right pointer, but also an integer. The integer is the
location of the node in the array.
When we delete an value from the window, we know its location i (0 <= i < L)
. By arr[i], we find the node in either heap, and we delete it. When we
delete the node, some other nodes will also be "moved" in the heap, so there
pointers in arr[] will also be updated. The operation is O(log(L)).
For the new value, we make a new node, and insert to either heap. The
operation is also O(log(L)).
So the total complexity is O(n*log(L)). |
|