AttributeError: 'NoneType' object has no attribute 'setCallSite'
Problem description
In PySpark, I want to calculate the correlation between two DataFrame vectors using the following code (importing pyspark and calling createDataFrame both work without any problem):
from pyspark.ml.linalg import Vectors
from pyspark.ml.stat import Correlation
import pyspark
spark = pyspark.sql.SparkSession.builder.master("local[*]").getOrCreate()
data = [(Vectors.sparse(4, [(0, 1.0), (3, -2.0)]),),
        (Vectors.dense([4.0, 5.0, 0.0, 3.0]),)]
df = spark.createDataFrame(data, ["features"])
r1 = Correlation.corr(df, "features").head()
print("Pearson correlation matrix:\n" + str(r1[0]))
However, I get the following AttributeError ('NoneType' object has no attribute 'setCallSite'):
AttributeError Traceback (most recent call last)
<ipython-input-136-d553c1ade793> in <module>()
6 df = spark.createDataFrame(data, ["features"])
7
----> 8 r1 = Correlation.corr(df, "features").head()
9 print("Pearson correlation matrix:\n" + str(r1[0]))
/usr/local/lib/python3.6/dist-packages/pyspark/sql/dataframe.py in head(self, n)
1130 """
1131 if n is None:
-> 1132 rs = self.head(1)
1133 return rs[0] if rs else None
1134 return self.take(n)
/usr/local/lib/python3.6/dist-packages/pyspark/sql/dataframe.py in head(self, n)
1132 rs = self.head(1)
1133 return rs[0] if rs else None
-> 1134 return self.take(n)
1135
1136 @ignore_unicode_prefix
/usr/local/lib/python3.6/dist-packages/pyspark/sql/dataframe.py in take(self, num)
502 [Row(age=2, name=u'Alice'), Row(age=5, name=u'Bob')]
503 """
--> 504 return self.limit(num).collect()
505
506 @since(1.3)
/usr/local/lib/python3.6/dist-packages/pyspark/sql/dataframe.py in collect(self)
463 [Row(age=2, name=u'Alice'), Row(age=5, name=u'Bob')]
464 """
--> 465 with SCCallSiteSync(self._sc) as css:
466 port = self._jdf.collectToPython()
467 return list(_load_from_socket(port, BatchedSerializer(PickleSerializer())))
/usr/local/lib/python3.6/dist-packages/pyspark/traceback_utils.py in __enter__(self)
70 def __enter__(self):
71 if SCCallSiteSync._spark_stack_depth == 0:
---> 72 self._context._jsc.setCallSite(self._call_site)
73 SCCallSiteSync._spark_stack_depth += 1
74
AttributeError: 'NoneType' object has no attribute 'setCallSite'
Is there a solution?
Recommended answer
There is a resolved issue in the Apache Jira about this:
https://issue.apache.org/jira/browse/SPARK-27335?jql=text%20~%20%22setcallsite%22
[Note: the issue is marked as resolved. If you are using a Spark version released after October 2019 and still encounter this problem, please report it on the Apache Jira.]
The poster of that issue suggests forcing the DataFrame's backend to sync with your Spark context:
df.sql_ctx.sparkSession._jsparkSession = spark._jsparkSession
df._sc = spark._sc
This worked for us; hopefully it works in other cases as well.
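One plausible reading of the traceback is that the DataFrame still refers to a Spark context whose JVM handle (`_jsc`) is `None`, for example because that context was stopped and a new session was created later in the same notebook. The sketch below does not use PySpark at all: `FakeJavaContext`, `FakeSparkContext`, and `FakeDataFrame` are hypothetical stand-ins that model only the `_sc` and `_jsc` attributes seen in the traceback, under the assumption that stopping a context clears `_jsc`. It is meant to illustrate why rebinding `df._sc` can help, not to describe PySpark's actual internals.

```python
class FakeJavaContext:
    """Hypothetical stand-in for the JVM-side context object."""

    def setCallSite(self, site):
        self.site = site


class FakeSparkContext:
    """Hypothetical stand-in for SparkContext; models only the `_jsc` handle."""

    def __init__(self):
        self._jsc = FakeJavaContext()

    def stop(self):
        # Assumption: stopping a context drops its JVM handle.
        self._jsc = None


class FakeDataFrame:
    """Hypothetical stand-in for DataFrame; keeps the context it was built with."""

    def __init__(self, sc):
        self._sc = sc

    def collect(self):
        # Mirrors the failing line from the traceback.
        self._sc._jsc.setCallSite("collect")
        return ["row"]


old_sc = FakeSparkContext()
df = FakeDataFrame(old_sc)
old_sc.stop()                # df now points at a dead context
new_sc = FakeSparkContext()  # e.g. a session created later in the notebook

try:
    df.collect()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

df._sc = new_sc              # the same kind of rebinding the workaround performs
rows = df.collect()

print(error_message)
print(rows)
```

In this toy model the first `collect()` fails with the same kind of AttributeError as in the question, and the second succeeds once `_sc` points at a live context. Because the real workaround touches private attributes (`_sc`, `_jsparkSession`), it may behave differently across Spark versions.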