
make URIStatus Serializable #18746

Open
chenruotao wants to merge 1 commit into Alluxio:main from chenruotao:feat/uristatus_serialiable

@chenruotao

What changes are proposed in this pull request?

Make URIStatus Serializable

Why are the changes needed?

We set the Hudi table location to an Alluxio URI (one using the 'alluxio://' scheme), and an exception occurs when scanning partition paths to obtain metadata.
When the Spark Core API is used to scan the file directory, the resulting objects must be serialized and transmitted to the driver, but URIStatus is not serializable.

Caused by: org.apache.hudi.exception.HoodieException: Error fetching partition paths from metadata table
        at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:317)
        at org.apache.hudi.BaseHoodieTableFileIndex.getAllQueryPartitionPaths(BaseHoodieTableFileIndex.java:194)
        at org.apache.hudi.BaseHoodieTableFileIndex.loadPartitionPathFiles(BaseHoodieTableFileIndex.java:237)
        at org.apache.hudi.BaseHoodieTableFileIndex.doRefresh(BaseHoodieTableFileIndex.java:282)
        at org.apache.hudi.BaseHoodieTableFileIndex.<init>(BaseHoodieTableFileIndex.java:147)
        at org.apache.hudi.SparkHoodieTableFileIndex.<init>(SparkHoodieTableFileIndex.scala:73)

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: task 0.0 in stage 0.0 (TID 0) had a not serializable result: alluxio.client.file.URIStatus
Serialization stack:
        - object not serializable (class: alluxio.client.file.URIStatus, value: URIStatus{info=FileInfo{fileId=83969966091, name=.hoodie, path=xxx/.hoodie, ufsPath=xxx/.hoodie, length=9, blockSizeBytes=0, creationTimeMs=1770369559402, completed=true, folder=true, pinned=false, pinnedlocation=[], cacheable=false, persisted=true, blockIds=[], inMemoryPercentage=0, lastModificationTimesMs=1770369226735, ttl=-1, lastAccessTimesMs=1770966814846, ttlAction=FREE, owner=xxx, group=hdfs, mode=457, persistenceState=PERSISTED, mountPoint=false, replicationMax=0, replicationMin=0, fileBlockInfos=[], mountId=8703550100576809600, inAlluxioPercentage=0, ufsFingerprint=, acl=user::rwx,group::--x,other::--x, defaultAcl=, xattr=[]}, cacheContext=null})
        - field (class: alluxio.hadoop.AlluxioFileStatus, name: mUriStatus, type: class alluxio.client.file.URIStatus)
        - object (class alluxio.hadoop.AlluxioFileStatus, AlluxioFileStatus{path=xxx/.hoodie; isDirectory=true; modification_time=1770369226735; access_time=1770966814846; owner=xxx; group=hdfs; permission=rwx--x--x; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false})
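
The serialization stack above follows a general Java pattern: a Serializable object fails to serialize when one of its fields holds a non-Serializable value. Below is a minimal sketch of that pattern using plain Java serialization. `PlainStatus`, `SerializableStatus`, `Wrapper`, and `canSerialize` are hypothetical stand-ins for illustration only; they are not Alluxio or Hudi classes, and the sketch does not reproduce the actual change in this PR.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableFieldSketch {
    // Hypothetical stand-in for a status class that does NOT implement Serializable.
    static class PlainStatus {
        final String path;
        PlainStatus(String path) { this.path = path; }
    }

    // Same shape as PlainStatus, but marked Serializable.
    static class SerializableStatus implements Serializable {
        private static final long serialVersionUID = 1L;
        final String path;
        SerializableStatus(String path) { this.path = path; }
    }

    // Hypothetical wrapper: Serializable itself, but it holds a status field
    // whose runtime value may or may not be Serializable.
    static class Wrapper implements Serializable {
        private static final long serialVersionUID = 1L;
        final Object status;
        Wrapper(Object status) { this.status = status; }
    }

    // Returns true if Java serialization of the object graph succeeds,
    // false if it hits a non-Serializable object along the way.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // The wrapper is Serializable, but its field value is not.
        System.out.println(canSerialize(new Wrapper(new PlainStatus("xxx/.hoodie"))));
        // Once the field's class implements Serializable, the whole graph serializes.
        System.out.println(canSerialize(new Wrapper(new SerializableStatus("xxx/.hoodie"))));
    }
}
```

The first call fails because serialization walks every non-transient field of the wrapper, which mirrors the `mUriStatus` entry in the stack above. Marking the field's class Serializable is one way to fix it; it assumes every non-transient field of that class is serializable as well.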

Does this PR introduce any user facing changes?

No

@alluxio-bot
Contributor

Thank you for your pull request.
In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement (CLA).
It's all electronic and will take just a few minutes. Please download the CLA form here, sign it, and e-mail it back to cla@alluxio.org
