Adobe interview question: how to read many files into memory? - JobHunting board - 未名 archive

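This page is an excerpt and archive of the corresponding thread on 未名空间. Posts less than a week old show at most 50 characters; posts older than a week show 500 characters.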


d**s Posts: 920	1
Went to Adobe to interview for a Senior SW Engineer position. Overall the interview went well, but I was stumped by the questions below and was told to go back and think about them.
Q1: "We need to compare thousands of text files with each other. They are not big, less than 100K each. They are in a directory tree with a few levels of subdirectories. How to speed up the comparing process?"
My answer: We can read all of these files into memory once, so that we reduce the number of disk I/Os.
[Feedback: That is a good approach.]
Q2: How to read these files into memory (on MS Wi
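The "read everything into memory once" answer to Q1 can be sketched roughly as follows. This is a minimal illustration, not what the interviewer had in mind: the function names `load_tree` and `identical_pairs` are made up for this example, and "compare" is assumed to mean byte-for-byte equality, which the post does not specify.

```python
import os
import tempfile
from itertools import combinations


def load_tree(root):
    """Read every file under root into memory, keyed by path relative to root."""
    contents = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                contents[os.path.relpath(full, root)] = fh.read()
    return contents


def identical_pairs(contents):
    """Return pairs of paths whose in-memory bytes are equal (no further disk I/O)."""
    return [
        (a, b)
        for a, b in combinations(sorted(contents), 2)
        if contents[a] == contents[b]
    ]


if __name__ == "__main__":
    # Build a tiny throwaway tree to demonstrate the idea.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "sub", "deeper"))
        for rel, text in [
            ("a.txt", "hello"),
            (os.path.join("sub", "b.txt"), "hello"),
            (os.path.join("sub", "deeper", "c.txt"), "world"),
        ]:
            with open(os.path.join(root, rel), "w") as fh:
                fh.write(text)
        files = load_tree(root)
        print(len(files), identical_pairs(files))
```

Each file is read from disk exactly once; all pairwise comparisons then run against the in-memory dictionary. With thousands of files under 100K each, the whole set is a few hundred MB at most.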
a**********s Posts: 588	2
Shocking, has Adobe started hiring?
s*******i Posts: 712	3
On Q3: I think the key point is that frequently reading small files from disk is not only slow in terms of I/O, it also wastes disk bandwidth. This bottleneck can be eased by improving the locality of data on disk and by making full use of disk bandwidth.
1. Use the disk bandwidth: divide the disk into regions of some suitable size (e.g. 128KB), called clusters. The cluster size is determined by the disk bandwidth, so that a single read makes full use of the throughput.
2. Provide a locality algorithm that groups highly related small files into the same cluster wherever possible. Files are then read from disk one cluster at a time. This uses the disk bandwidth fully, and thanks to locality, after one file has been processed in memory the next file to process is very likely already in the cluster you just read. That reduces the number of I/Os.
Q1: Probably a similar situation; I don't know exactly how to do it. But reading in all the files at once, as you suggest, may not be appropriate. Together these files add up to several hundred MB, right?
Q2: Does it have to do with how files and directories are distributed on disk? Can someone explain how files and directories are laid out on disk? Is there any pattern?
[Replying to d**s, post 1]
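The clustering idea in this post could look something like the sketch below. It is only an illustration under stated assumptions: the post does not say what the "locality algorithm" is, so sorting by path (files in the same directory are treated as related) is a stand-in chosen for this example, and `pack_into_clusters` is a hypothetical name. Only the 128KB cluster size comes from the post.

```python
CLUSTER_SIZE = 128 * 1024  # 128 KB, the example size from the post


def pack_into_clusters(file_sizes, cluster_size=CLUSTER_SIZE):
    """Greedily pack files into clusters, keeping path-adjacent files together.

    file_sizes maps path -> size in bytes.
    Sorting by path is an assumed stand-in for a real locality algorithm.
    A file larger than cluster_size ends up alone in its own cluster.
    """
    clusters = []
    current, used = [], 0
    for path in sorted(file_sizes):
        size = file_sizes[path]
        # Start a new cluster when the next file would not fit.
        if current and used + size > cluster_size:
            clusters.append(current)
            current, used = [], 0
        current.append(path)
        used += size
    if current:
        clusters.append(current)
    return clusters


# Made-up sizes, purely for demonstration.
sizes = {
    "docs/a.txt": 60 * 1024,
    "docs/b.txt": 50 * 1024,
    "docs/c.txt": 40 * 1024,
    "src/d.txt": 90 * 1024,
    "src/e.txt": 30 * 1024,
}
for i, cluster in enumerate(pack_into_clusters(sizes)):
    print(i, cluster)
```

Reading one cluster then brings in several related files with a single large sequential read, which is the effect the post is after.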
d**s Posts: 920	4
Thanks. Actually, they just want to read all the files into memory. The total is only a few hundred MB in memory, so that is not an issue.
[Replying to s*******i, post 3]
y*r Posts: 590	5
For Q2, it seems a tree is not bad here? I really have no idea about Q3... has anyone got a good idea?
[Replying to d**s, post 1]
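One way to read the "tree" suggestion is an in-memory structure that mirrors the directory tree. Q2 is cut off in this archive, so this is a guess at what was meant rather than the thread's answer; `build_tree` and `walk_files` are hypothetical names, and paths are assumed to use "/" as the separator.

```python
def build_tree(contents):
    """Turn {"relative/path": bytes} into nested dicts mirroring the directories."""
    root = {}
    for path, data in contents.items():
        parts = path.split("/")
        node = root
        for directory in parts[:-1]:
            # Descend into the subdirectory, creating it on first use.
            node = node.setdefault(directory, {})
        node[parts[-1]] = data
    return root


def walk_files(tree, prefix=""):
    """Yield (path, data) for every file in the tree, depth first, sorted by name."""
    for name, value in sorted(tree.items()):
        path = f"{prefix}{name}"
        if isinstance(value, dict):
            yield from walk_files(value, path + "/")
        else:
            yield path, value


# Made-up file contents, purely for demonstration.
tree = build_tree({
    "a.txt": b"1",
    "sub/b.txt": b"2",
    "sub/deep/c.txt": b"3",
})
for path, data in walk_files(tree):
    print(path, data)
```

Inner dicts are directories and bytes values are file contents, so a file can be looked up by walking its path components.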
s******8 Posts: 4192	6
Q3 detour
