A Week of Living in Auckland


Several former colleagues have moved to Auckland for work, so I planned to go and experience life in Auckland myself. Checking flight prices, a Hainan Airlines round trip was 2,810 RMB including tax, and the Shenzhen layover meant I could drop by home on the way, so it was well worth booking. The itinerary:

2019.6.21 PEK-SZX
2019.6.22 SZX-AKL
2019.6.29 AKL-SZX
2019.7.01 SZX-PEK

I booked a downtown apartment on Airbnb: two bedrooms with three beds, a living room and kitchen, and a balcony with a sea view. Seven nights, including fees, came to roughly 5,800 RMB. Most people only pass through Auckland, usually connecting onward to the South Island. Without a car to roam around in, we settled for simply experiencing everyday life in Auckland.

When I bought the tickets I figured my roommate most likely wouldn't get time off and I would probably go alone, so I bought a ticket for my mom as well. Before departure, my roommate confirmed the leave was approved, and it turned into a trip for three: a taste of Auckland family life.

Hive fails to rename an S3 table with the error "New location for this table already exists"

Issue

- In hive-cli, rename a table with the following command:

hive> alter table large_table_bk rename to large_table;

- About 10 minutes later, it fails with the following error:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. New location for this table default.large_table already exists : s3://feichashao-hadoop/warehouse/large_table

- However, before the "rename" command was executed, the target directory did not exist in S3, so this error was not expected.
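
Before digging into the cause, it is worth confirming whether any object really exists under the target prefix. The sketch below does this with boto3; it is only an illustration, with the bucket and prefix taken from the error message above, so adjust them for your own table location.

# check_target_prefix.py -- minimal sketch, not part of the original troubleshooting.
# Lists at most one key under the prefix Hive complains about, to see whether
# the "already exists" check has anything to point at.
import boto3

bucket = "feichashao-hadoop"            # from the error message above
prefix = "warehouse/large_table/"       # from the error message above

s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)

if resp.get("KeyCount", 0) > 0:
    print(f"Found objects under s3://{bucket}/{prefix} -- the rename would legitimately be refused.")
else:
    print(f"No objects under s3://{bucket}/{prefix} -- the 'already exists' error is unexpected.")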

hive-server2 keeps 2k+ s3n-worker threads after jobs finish


TL;DR: This is a bug in EMR 5.6. Upgrading to EMR 5.8 or later resolves the issue.

Issue

A user reported seeing 2000+ s3n-worker threads left over after jobs finished, and had to restart the hive-server2 service every day to mitigate the issue.

# sudo -u hive jstack 11089 | grep s3n-worker | wc -l
2000

The thread names repeat from s3n-worker-0 to s3n-worker-19. In other words, there are roughly 100 * 20 s3n-worker threads.

"s3n-worker-19" #70 daemon prio=5 os_prio=0 tid=0x00007f5ac4cf0800 nid=0x10ad waiting on condition [0x00007f5ac1dee000]
......
"s3n-worker-1" #52 daemon prio=5 os_prio=0 tid=0x00007f5ac5462000 nid=0x109b waiting on condition [0x00007f5aca23f000]
"s3n-worker-0" #51 daemon prio=5 os_prio=0 tid=0x00007f5ac5480000 nid=0x109a waiting on condition [0x00007f5aca641000]
......
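
To see the repetition concretely, the saved jstack output can be grouped by thread name. The snippet below is a rough sketch, assuming the dump was first written to a file (hs2.jstack is a hypothetical name, e.g. from sudo -u hive jstack 11089 > hs2.jstack).

# count_s3n_workers.py -- rough sketch: count how often each s3n-worker name
# appears in a saved jstack dump, to show that the same 20 names
# (s3n-worker-0 .. s3n-worker-19) repeat about 100 times each.
import re
from collections import Counter

counts = Counter()
with open("hs2.jstack") as f:                      # hypothetical file name
    for line in f:
        m = re.match(r'"(s3n-worker-\d+)"', line)  # jstack thread header lines
        if m:
            counts[m.group(1)] += 1

print(f"{len(counts)} distinct names, {sum(counts.values())} s3n-worker threads in total")
for name, n in sorted(counts.items(), key=lambda kv: int(kv[0].rsplit('-', 1)[1])):
    print(f"{name}: {n}")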

Environment

AWS EMR 5.6

Spark RDD checkpoint on S3 fails intermittently with an exception

Issue

- Run a Spark job that saves an RDD checkpoint to S3.
- The Spark job fails intermittently with the error below:

org.apache.spark.SparkException: Checkpoint RDD has a different number of partitions from original RDD. Original RDD [ID: xxx, num of partitions: 6]; Checkpoint RDD [ID: xxx, num of partitions: 5].
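
For reference, the setup can be reproduced in miniature by pointing the checkpoint directory at S3 and checkpointing a small RDD. The sketch below uses PySpark; the S3 path is a placeholder and the partition count simply mirrors the error above.

# checkpoint_to_s3.py -- minimal PySpark sketch, not the original job.
from pyspark import SparkContext

sc = SparkContext(appName="rdd-checkpoint-s3")
sc.setCheckpointDir("s3://your-bucket/spark-checkpoints/")  # placeholder bucket

rdd = sc.parallelize(range(1000), numSlices=6)  # 6 partitions, as in the error above
rdd.checkpoint()      # mark the RDD for checkpointing
rdd.count()           # first action materializes the checkpoint files on S3

# Reading the checkpoint back should report the same partition count; the
# intermittent error above shows it coming back with one partition missing.
print(rdd.getNumPartitions())
sc.stop()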
