Programming board - Error counting the total number of lines in a txt file on MS-DOS (repost)
l******9
Posts: 579
1
[The following text is reposted from the JobHunting board]
From: light009 (light009), Board: JobHunting
Subject: Error counting the total number of lines in a txt file on MS-DOS
Site: BBS 未名空间站 (Thu Nov 20 18:34:45 2014, US Eastern)
I would like to count the total number of lines in a text file (> 60 GB) on
MS-DOS.
I used:
findstr /R /N "^" file.txt | find /C ":"
But the returned result is a negative number.
Is this an overflow?
The file has no more than 5 billion lines.
A 4-byte signed integer ranges from −2,147,483,648 to 2,147,483,647.
So do I need to design a script that counts by dividing the result by 1000?
If so, please help me with how to write such a script on MS-DOS.
Thanks
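
A negative count is what a signed 32-bit counter would show once it passes 2,147,483,647. The sketch below is only an illustration in Python, not part of the original thread. It assumes that `find /C` keeps its count in a signed 32-bit integer that wraps around, which the thread does not confirm. The names `as_int32` and `candidates` and the figure 3,000,000,000 are hypothetical, chosen for the example.

```python
def as_int32(n):
    """Interpret the low 32 bits of n as a signed two's-complement integer."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def candidates(reported, upper_bound):
    """List every true count <= upper_bound that a wrapped 32-bit counter
    would display as `reported`."""
    low = reported & 0xFFFFFFFF          # the 32 bits that survived
    return list(range(low, upper_bound + 1, 2**32))


true_count = 3_000_000_000               # hypothetical line count
reported = as_int32(true_count)
print(reported)                          # -1294967296

# With the poster's bound of 5 billion lines, only one value fits.
print(candidates(reported, 5_000_000_000))   # [3000000000]
```

Under that assumption, a negative result together with the stated bound of 5 billion lines leaves a single candidate, the reported value plus 2^32, so dividing the count by 1000 should not be necessary.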
S*A
Posts: 7142
2
Linux has "wc -l"
That might just work out of the box.
S*A
Posts: 7142
3
The wc source is here:
https://www.gnu.org/software/cflow/manual/html_node/Source-of-wc-command.html
The counter is an unsigned long, so a 64-bit build gives a 64-bit counter.
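
How wide `unsigned long` actually is depends on the platform's data model, not only on whether the build is 64-bit. LP64 systems such as 64-bit Linux use 64 bits. Native 64-bit Windows toolchains follow LLP64 and keep `unsigned long` at 32 bits. As a hedged way to check a given machine, the sketch below asks Python's `ctypes` for the size used by the C compiler that built the interpreter. It says nothing about how any particular `wc` binary was compiled.

```python
import ctypes

# Width of C's `unsigned long` for the ABI this Python interpreter was built with.
bits = ctypes.sizeof(ctypes.c_ulong) * 8
print("unsigned long is", bits, "bits on this platform")

# Largest line count such a counter can hold before wrapping around.
max_count = 2**bits - 1
print("largest representable count:", max_count)
```

If this reports 32 bits, a counter declared as `unsigned long` wraps after 4,294,967,295 lines even in a 64-bit build.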
l******9
Posts: 579
4
I used "wc -l" but I got a wrong number for a large file (120 GB).
The result is a positive number but it is wrong.
Any help would be appreciated.

[Quoting S*A's post:]
: The wc source is here:
: https://www.gnu.org/software/cflow/manual/html_node/Source-of-wc-command.html
: The counter is an unsigned long, so a 64-bit build gives a 64-bit counter.

w********m
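Posts: 1137
5
Use Python.
O(1) space, O(n) time:
cnt = 0
# Stream the file line by line; only one line is held in memory at a time.
with open('file.txt', 'r') as infile:
    for _ in infile:
        cnt += 1
print(cnt)
O(n) space, O(n/k) time:
import pyspark
sc = pyspark.SparkContext()
# textFile() splits the file into partitions that are counted in parallel.
infile = sc.textFile('file.txt')
print(infile.count())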
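
A variant of the O(1)-space loop above, offered as a sketch and not as the poster's method: read the file in fixed-size binary chunks and count newline bytes. This avoids text decoding and very long lines in memory, and Python integers do not overflow. The function name `count_lines` and the 1 MiB chunk size are assumptions for illustration. Note that it counts newline characters, as `wc -l` does, so a final line without a trailing newline is not counted, whereas the line-iteration loop above does count it.

```python
import io


def count_lines(stream, chunk_size=1 << 20):
    """Count newline bytes in a binary stream, reading chunk_size bytes at a time."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:                    # empty bytes object means end of stream
            break
        total += chunk.count(b"\n")
    return total


# Small in-memory demonstration; a tiny chunk size forces several reads.
sample = io.BytesIO(b"a\nb\nc\n")
print(count_lines(sample, chunk_size=2))   # 3
```

For a real file, open it in binary mode and pass the file object, for example `with open('file.txt', 'rb') as f: print(count_lines(f))`.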
n*****t
Posts: 22014
6
Wrong in what way? Is the count above 2^16?

[Quoting l******9's post:]
: I used "wc -l" but I got a wrong number for a large file (120 GB).
: The result is a positive number but it is wrong.
: Any help would be appreciated.

l******9
Posts: 579
7
I am not sure.
I used Cygwin to SSH into the server where the file is located.
Then I ran "wc -l" and got the wrong number, but it is not an overflow because
it is positive, not negative.

[Quoting n*****t's post:]
: Wrong in what way? Is the count above 2^16?
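
A positive result does not by itself rule out overflow. An unsigned 32-bit counter never turns negative; it wraps modulo 2^32 and shows a smaller positive number. The sketch below illustrates this under the assumption, which the thread does not confirm, that the `wc` in use kept its count in 32 unsigned bits. The name `as_uint32` and the count of 5,000,000,000 are hypothetical.

```python
def as_uint32(n):
    """Value an unsigned 32-bit counter would display after counting n items."""
    return n % 2**32


true_count = 5_000_000_000         # hypothetical line count
print(as_uint32(true_count))       # 705032704, positive but wrong

# Counts below 2**32 are displayed unchanged.
print(as_uint32(4_294_967_295))    # 4294967295
```

If the reported number is off from the expected count by a multiple of 4,294,967,296, a wrapped 32-bit counter is a plausible cause.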
l******9
Posts: 579
8
I am not allowed to install Python on the server.
I can only access the file remotely, which would make processing a large
120 GB file take a very long time.

[Quoting w********m's post:]
: Use Python.
: O(1) space, O(n) time:
: cnt = 0
: with open('file.txt', 'r') as infile:
:     for _ in infile:
:         cnt += 1
: print(cnt)
: O(n) space, O(n/k) time:
: import pyspark
: sc = pyspark.SparkContext()

S*A
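Posts: 7142
9
Cygwin is 32-bit; the 64-bit version is only in alpha and has many problems
when you run it.
You need to use a 64-bit wc.
You could try 64-bit MinGW; if it includes wc, that one should be 64-bit.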