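This page is an excerpt and archive of the corresponding post on 未名空间 (MITBBS); posts less than a week old are shown up to 50 characters, older posts up to 500.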
DataSciences board - Urgent, please help (moving avg using spark dataframe window functions)
w**2
Posts: 147
1
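Could the experts here tell me how to use window functions to compute a 3-day moving average? I can't make sense of the error message; why does it need a HiveContext?
Thanks a lot!
Here is the example: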
from pyspark.sql import Window
from pyspark.sql import SQLContext
import pyspark.sql.functions as func
Table T:
Date Num
07/01 2
07/02 3
07/03 2
07/04 2
07/05 5
07/06 6
07/07 7
sqlCtx = SQLContext(sc)
T.registerTempTable("T")
w = Window.partitionBy(T.Date).orderBy(T.Date).rangeBetween(-2,0)
a = (func.avg(T["Num"]).over(w))
T.select(T["Date"],T["Num"],a.alias("moving_avg"))
Error Msg:
Could not resolve window function 'avg'. Note that, using window functions
currently requires a HiveContext;
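For reference, the numbers a 3-day trailing average should produce for table T can be checked without Spark. The sketch below is plain Python; the helper name `trailing_avg` is made up for illustration, and it assumes the window covers the current row plus the two rows before it (fewer at the start of the table).

```python
def trailing_avg(values, window=3):
    """Average of each value together with up to (window - 1) preceding values."""
    result = []
    for i in range(len(values)):
        # Clamp the start so the first rows average over fewer values.
        chunk = values[max(0, i - window + 1): i + 1]
        result.append(sum(chunk) / float(len(chunk)))
    return result


# Sample data copied from table T in the question.
dates = ["07/01", "07/02", "07/03", "07/04", "07/05", "07/06", "07/07"]
nums = [2, 3, 2, 2, 5, 6, 7]

for date, num, avg in zip(dates, nums, trailing_avg(nums)):
    print("%s %d %.2f" % (date, num, avg))
```

The first two rows average over fewer than three values (2.0 and 2.5), and the last row is (5 + 6 + 7) / 3 = 6.0.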
S*******e
Posts: 525
2
SQLContext supports only a very limited set of SQL functions. HiveContext supports
many more, including the window functions you need, and anything SQLContext
supports, HiveContext supports as well.
I think it will work if you change "from pyspark.sql import SQLContext" to
"from pyspark.sql import HiveContext" and "sqlCtx = SQLContext(sc)" to
"sqlCtx = HiveContext(sc)". (By the way, my Python knowledge is very limited;
I mainly use Java with Spark.)
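Applied to the code in the question, the suggested change might look like the sketch below. It assumes a Spark 1.x installation where `HiveContext` is available and an existing SparkContext `sc`; the function name `moving_avg_with_hive_context` is hypothetical. Two things differ from the original beyond the context swap, and both are editorial assumptions rather than part of the reply: the window is not partitioned by Date (each date would otherwise sit alone in its own partition, so the average would simply equal Num), and the frame uses `rowsBetween(-2, 0)` so that it counts rows.

```python
def moving_avg_with_hive_context(sc):
    """Return a DataFrame with a 3-row trailing average of Num (Spark 1.x sketch)."""
    # Imports live inside the function so this file loads even without Spark.
    from pyspark.sql import HiveContext, Window
    import pyspark.sql.functions as func

    # The one change suggested in the reply: HiveContext instead of SQLContext.
    sqlCtx = HiveContext(sc)

    # Sample data copied from table T in the question.
    rows = [("07/01", 2), ("07/02", 3), ("07/03", 2), ("07/04", 2),
            ("07/05", 5), ("07/06", 6), ("07/07", 7)]
    T = sqlCtx.createDataFrame(rows, ["Date", "Num"])

    # Assumption: order by Date with no partitioning, and take the current
    # row plus the two rows before it.
    w = Window.orderBy("Date").rowsBetween(-2, 0)
    moving_avg = func.avg(T["Num"]).over(w).alias("moving_avg")
    return T.select(T["Date"], T["Num"], moving_avg)
```

Calling `moving_avg_with_hive_context(sc).show()` in a PySpark shell would then display the three columns. Note that the DataFrame is created from the HiveContext here, since a DataFrame keeps the context it was built with.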
w**2
Posts: 147
3
Thanks so much. Hopefully version 1.5.0 will improve on this.