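This page is an excerpt and archive of the corresponding thread on 未名空间 (MITBBS). Posts less than a week old are shown up to 50 characters; older posts are shown up to 500 characters.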
Programming board - A question for the experts about handling a categorical variable in machine learning
l******0
Posts: 2
1
Say I have a relatively big dataset with a categorical variable that has many
possible values/levels, for example, country.
If I one-hot encode it as scikit-learn suggests, I get an "out of memory"
error. But when I load the data into R, treat the variable as a normal
factor, and call an R machine learning library or H2O, everything works
fine: at least there is no error message and the results are acceptable. So
I'm wondering how R or H2O treats it differently, and what the correct way
to handle this kind of problem is.
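To make the memory difference concrete, here is a minimal pure-Python sketch, not what scikit-learn, R, or H2O actually runs internally. It contrasts a factor-style encoding (one integer code per row plus a table of levels, which is how R stores a factor) with a dense one-hot matrix, which needs rows × levels cells. The names `factor_encode`, `sparse_one_hot`, and `dense_cell_count` are hypothetical helpers made up for this illustration.

```python
from __future__ import annotations

from typing import Hashable, Sequence


def factor_encode(values: Sequence[Hashable]) -> tuple[list, list[int]]:
    """Factor-style encoding: a sorted table of levels plus one integer code per row."""
    levels = sorted(set(values))
    index = {level: code for code, level in enumerate(levels)}
    codes = [index[v] for v in values]
    return levels, codes


def sparse_one_hot(codes: Sequence[int]) -> list[tuple[int, int]]:
    """(row, column) coordinates of the 1-entries of the one-hot matrix.

    Each row has exactly one 1, so this list has len(codes) entries
    no matter how many levels exist.
    """
    return [(row, code) for row, code in enumerate(codes)]


def dense_cell_count(n_rows: int, n_levels: int) -> int:
    """Number of cells a dense one-hot matrix would have to store."""
    return n_rows * n_levels


if __name__ == "__main__":
    countries = ["US", "CN", "IN", "US", "FR", "CN"]  # toy data, assumed for illustration
    levels, codes = factor_encode(countries)
    print("levels:", levels)
    print("codes:", codes)
    print("sparse nonzeros:", len(sparse_one_hot(codes)))
    print("dense cells:", dense_cell_count(len(codes), len(levels)))
```

With millions of rows and thousands of levels, the dense cell count grows as the product of the two, while the integer codes and the sparse coordinates grow only with the number of rows. That may be part of why the factor route fits in memory while a dense one-hot matrix does not; whether a particular library expands the factor later depends on the library.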
W***o
Posts: 6519
2
Could you do it like this:
1 united states
0 other
Loop over all the countries this way, marking the current country with 1.
That way each pass only involves two categories.
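One way to read the suggestion above is as a one-vs-rest loop: build a single 0/1 column for the current country, use it, then move on to the next country, so the full indicator matrix is never held in memory at once. The sketch below is an interpretation under that assumption; `one_vs_rest_columns` is a hypothetical helper name, not part of any library.

```python
from typing import Hashable, Iterator, List, Sequence, Tuple


def one_vs_rest_columns(
    values: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int]]]:
    """Yield (level, indicator column) pairs one level at a time.

    The indicator is 1 where the row equals the current level and 0 otherwise.
    Because this is a generator, only one column exists at any moment.
    """
    for level in sorted(set(values)):
        yield level, [1 if v == level else 0 for v in values]


if __name__ == "__main__":
    countries = ["united states", "china", "united states"]  # toy data
    for level, column in one_vs_rest_columns(countries):
        # In practice each column would be fed to a per-level model or statistic here.
        print(level, column)
```

Taken together, the columns are the same as a full one-hot encoding; the saving comes only from processing them one at a time, so this helps when each column can be used independently.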
m******r
Posts: 1033
3
That probably depends on which specific library you are using.