l****u (84 posts) | 1 I am going to import a very large amount of stock market data from multiple .txt files (
over 3000 files) into a SQL Server database. I was wondering what the most
efficient way to do that is?
I used BULK INSERT and it takes forever (estimated 7 days). |
B*****g (34098 posts) | 2 SSIS
[quoting l****u's post above]
|
j*****n (1781 posts) | 3 No dude, SSIS is slower than bcp...
If I were doing this, I'd ask the DBA to extend the database size by at least
100GB, with multiple data files on different disks. I'd also set the database
to the simple recovery model, and set bcp to commit every 500,000 rows. Oh,
and remember to remove all indexes from the table first.
Anything else?
[quoting B*****g's post above]
|
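The checklist above (simple recovery model, batched commits of 500,000 rows, indexes dropped up front) can be sketched as a small script that generates the corresponding T-SQL. This is a minimal sketch, not the poster's actual code: the table `dbo.Quotes`, the index `IX_Quotes_Symbol`, and the file names are all hypothetical placeholders.

```python
# Sketch: generate a batched-load T-SQL script following the tips above.
# Table, index, and file names are assumptions - adjust to your schema.

def build_import_script(files, table="dbo.Quotes", batch_size=500_000):
    """Emit T-SQL: simple recovery, drop index, one batched BULK INSERT per file."""
    stmts = [
        "ALTER DATABASE CURRENT SET RECOVERY SIMPLE;",
        # Dropping indexes first avoids per-row index maintenance during the load.
        "DROP INDEX IF EXISTS IX_Quotes_Symbol ON dbo.Quotes;",
    ]
    for f in files:
        stmts.append(
            f"BULK INSERT {table} FROM '{f}' "
            f"WITH (BATCHSIZE = {batch_size}, TABLOCK, "
            f"FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n');"
        )
    # Rebuild the index once, after all files are in.
    stmts.append("CREATE INDEX IX_Quotes_Symbol ON dbo.Quotes (Symbol);")
    return "\n".join(stmts)

script = build_import_script([r"C:\data\quotes_0001.txt", r"C:\data\quotes_0002.txt"])
print(script)
```

`BATCHSIZE` makes each 500,000-row batch a separate committed transaction, so a mid-file failure only rolls back the current batch rather than the whole file.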
p********l (279 posts) | 4 Pretty much those things. Specifying the batch size is important.
[quoting j*****n's post above]
|
c*******t (55 posts) | 5 1. BULK INSERT is faster than SSIS if you turn on the table lock.
2. If you can ensure the data in all the files is sorted, you can save the time
spent on index rebuilds.
3. Whether in Simple recovery or Bulk-Logged mode, SQL Server will start writing
huge amounts of data to the log file after the first file; the log usually grows
to 2~3 times the size of the data you are importing, so I suggest you shrink the
log file after each BULK INSERT. (MSSQL will not write much data to the log file
if the table is empty.) |
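One way to apply points 1-3 is a per-file statement pair: `TABLOCK` for the table lock, an `ORDER` hint when the files are pre-sorted on the clustered key, and a log shrink after each file. A sketch below; the table name, sort column `TradeDate`, and log file name `StockDB_log` are placeholders, not details from the thread.

```python
# Sketch: the two statements to run per file, per the three tips above.
# dbo.Quotes, TradeDate, and StockDB_log are assumed names.

def per_file_statements(path, table="dbo.Quotes", log_file="StockDB_log"):
    bulk = (
        f"BULK INSERT {table} FROM '{path}' WITH ("
        "TABLOCK, "                # tip 1: table lock makes it beat SSIS
        "ORDER (TradeDate ASC), "  # tip 2: valid only if the file is pre-sorted
        "FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n');"
    )
    shrink = f"DBCC SHRINKFILE ({log_file}, 1);"  # tip 3: reclaim log growth
    return [bulk, shrink]

for stmt in per_file_statements(r"C:\data\quotes_0001.txt"):
    print(stmt)
```

The `ORDER` hint tells SQL Server the incoming rows already match the clustered index order, letting it skip the sort step; if the files are not actually sorted, the hint should be omitted.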
i***c (301 posts) | 6 Write a class and do plain SQL INSERTs only |
j*******7 (6300 posts) | 7 I think the most important thing is to load the files in parallel. Other
considerations include:
set the DB recovery model to Simple
pre-allocate the database files and optimize the disk system
are you going to load into a single table? Perhaps consider a partitioned
design. |
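The parallel-loading advice can be sketched as a small driver that fans the 3000+ files out across worker threads, each issuing its own load. This is a skeleton under stated assumptions: `load_file` is a stub standing in for a real per-file BULK INSERT over a database connection (e.g. via pyodbc), and with a partitioned table each worker could target its own partition to reduce contention.

```python
# Sketch: drive per-file loads in parallel. load_file is a stub; in practice
# it would open a connection and run the BULK INSERT for that file.
from concurrent.futures import ThreadPoolExecutor

def load_file(path):
    # Placeholder for: BULK INSERT ... FROM '{path}' WITH (TABLOCK, BATCHSIZE = 500000);
    return f"loaded {path}"

def load_all(paths, workers=4):
    # map() preserves input order, so results line up with paths.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(load_file, paths))

results = load_all([f"quotes_{i:04d}.txt" for i in range(8)])
print(results)
```

Since a bulk load is mostly I/O-bound on the database side, a handful of threads is usually enough; the bottleneck becomes the disk system, which is why pre-allocating files across disks matters.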
G********d (593 posts) | 8 It could be slow for any number of reasons:
RAID?
recovery model?
Even with SSIS, it should not take that long...
Check your log/locks, and if you have Profiler running you can monitor it. |
c*******o (8869 posts) | 9 SAS is the way to go
[quoting l****u's post above]
|
c*****y (75 posts) | 10 |
W******9 (191 posts) | 11 DTS, easy and fast |
w*******e (1622 posts) | 12 Nowadays, it is SSIS....
[quoting W******9's post above]
|