DataSciences board - Memory Error in pandas.concat with Python
y********0
Posts: 638
1
I have recently been working with 80 DataFrames in Python, each holding
650K rows with the same 11 columns, whose types range from string to numeric.
My initial goal is to concatenate all the DataFrames and run a general
analysis, but I keep hitting dead ends.
1. I started with pd.concat([df1, df2]) but got a MemoryError, and I could
not find a solution after searching the internet.
2. Following some online suggestions, I then converted the numeric columns to
float32 to reduce the memory footprint, but that did not help either.
3. As for background on my PC and Python setup: I am using 32-bit Python
2.7.5 with pandas 0.13.0, and there is little chance of switching to a 64-bit
version.
Any comments or suggestions would be really appreciated.
Z**0
Posts: 1119
2
You cannot do anything with your current configuration.
The final DataFrame is roughly 4 GiB (float64, i.e. 8 bytes per element),
which will not work on 32-bit at all. Besides, a 32-bit program can only
request 2 GiB of memory.
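The estimate above can be checked with a little arithmetic. The following is only a rough sketch: it assumes every cell is stored as a fixed-width number (the string columns would need more), and it ignores index and object overhead. The helper name `estimate_gib` is made up for illustration and is not part of pandas.

```python
# Back-of-envelope size of the concatenated frame.
# Assumption: every cell is a fixed-width number; string columns need more.

GIB = 2 ** 30  # bytes per GiB


def estimate_gib(n_frames, rows_per_frame, n_columns, bytes_per_element):
    """Return the raw data size in GiB, ignoring index and object overhead."""
    elements = n_frames * rows_per_frame * n_columns
    return elements * bytes_per_element / float(GIB)


# Figures from the question: 80 frames x 650K rows x 11 columns.
as_float64 = estimate_gib(80, 650000, 11, 8)
as_float32 = estimate_gib(80, 650000, 11, 4)

print("float64: %.2f GiB" % as_float64)
print("float32: %.2f GiB" % as_float32)
```

This gives about 4.26 GiB for float64 and about 2.13 GiB for float32, so even the float32 conversion tried in the question stays above the 2 GiB figure.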
y********0
Posts: 638
3
Thanks Zer0.

[Quoting Z**0's post]
: You cannot do anything with your current configuration.
: The final DataFrame is roughly 4 GiB (float64, i.e. 8 bytes per element),
: which will not work on 32-bit at all. Besides, a 32-bit program can only
: request 2 GiB of memory.
