肥叉烧 feichashao.com – Page 5

A Week Experiencing Life in Auckland

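Several former colleagues have moved to Auckland for work, so I planned to get a taste of life there. I checked airfares: a Hainan Airlines round trip cost 2,810 yuan including tax, and the layover in Shenzhen meant I could stop by home on the way, so it was worth booking. The itinerary: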

2019.6.21 PEK-SZX
2019.6.22 SZX-AKL
2019.6.29 AKL-SZX
2019.7.01 SZX-PEK

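I booked a downtown apartment on Airbnb: two bedrooms with three beds, a living room and kitchen, and a balcony with a sea view. Seven nights came to about 5,800 RMB including service fees. Most people only pass through Auckland, usually connecting onward to the South Island. Without a car to get around, we decided to settle in and simply experience life in Auckland.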

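When I bought the tickets, I figured my roommate probably wouldn't be able to get time off and I would most likely be going alone, so I also bought a ticket for my mom. Before departure, my roommate confirmed that the leave had been approved, so it became a trip for three: a taste of family life in Auckland.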

Installing and using xrdp to connect to an Ubuntu desktop

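My company computer runs Windows 10 and I don't have administrator rights, so I can't get anything done on it. Short of buying a new computer, the only option is to work on an Ubuntu desktop on AWS over a remote connection. Here are my notes on the installation.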

Environment

- AWS
- Ubuntu 18.04

Hive fails to rename an S3 table with error "New location for this table already exists"

Issue

- In hive-cli, rename a table with the command:

hive> alter table large_table_bk rename to large_table;

- About 10 minutes later, it fails with the following error:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. New location for this table default.large_table already exists : s3://feichashao-hadoop/warehouse/large_table

- However, the directory did not exist in S3 before the "rename" command was executed, so this error is unexpected.

A Plan to Watch a Marathon in Qinhuangdao

Just a running log. My roommate went to Qinhuangdao to run a marathon and said I could crash in his hotel room, so I went.

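I booked late: for the outbound trip I managed to get a Saturday daytime train ticket, but for the return only first-class seats on the D-train (动车) were left. What a luxury.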

Got up early, headed to Beijing Railway Station, boarded the train, and off I went.


Can you check in and drop off bags early at Shenzhen Airport?

Can you check bags early with Hainan Airlines at Shenzhen Airport? What about Donghai Airlines?

I couldn't find the answer online, so I'm asking and answering the question myself.

hive-server2 keeps 2k+ s3n-worker threads after jobs finish

TL;DR: This is a bug in EMR 5.6. Upgrading to EMR 5.8 or later resolves the issue.

Issue

A user reported seeing 2000+ s3n-worker threads in hive-server2 after jobs had finished. The user has to restart the hive-server2 service every day to mitigate the issue.

# sudo -u hive jstack 11089 | grep s3n-worker | wc -l
2000

The thread names repeat from s3n-worker-0 to s3n-worker-19. In other words, there are 100 sets of 20 s3n-worker threads (100 * 20 = 2000).

"s3n-worker-19" #70 daemon prio=5 os_prio=0 tid=0x00007f5ac4cf0800 nid=0x10ad waiting on condition [0x00007f5ac1dee000]
......
"s3n-worker-1" #52 daemon prio=5 os_prio=0 tid=0x00007f5ac5462000 nid=0x109b waiting on condition [0x00007f5aca23f000]
"s3n-worker-0" #51 daemon prio=5 os_prio=0 tid=0x00007f5ac5480000 nid=0x109a waiting on condition [0x00007f5aca641000]
......
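
To confirm that the same 20 names repeat rather than 2000 distinct names appearing, the dump can be grouped by thread name. The following is a minimal sketch, not part of the original investigation: the helper `count_threads_by_name` and the sample lines are hypothetical, and it assumes each thread header in the jstack output starts with the quoted thread name, as in the lines above.

```python
import re
from collections import Counter

# Matches the quoted thread name at the start of a jstack thread header line.
THREAD_NAME = re.compile(r'^"([^"]+)"')


def count_threads_by_name(dump_text, prefix="s3n-worker"):
    """Count how many threads share each name, keeping names that start with prefix."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = THREAD_NAME.match(line)
        if match and match.group(1).startswith(prefix):
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample in the same shape as the dump above; the third line
# stands in for a thread of the same name created by a second pool.
sample = "\n".join([
    '"s3n-worker-0" #51 daemon prio=5 os_prio=0 waiting on condition',
    '"s3n-worker-1" #52 daemon prio=5 os_prio=0 waiting on condition',
    '"s3n-worker-0" #71 daemon prio=5 os_prio=0 waiting on condition',
    '"main" #1 prio=5 os_prio=0 runnable',
])

counts = count_threads_by_name(sample)
for name, n in sorted(counts.items()):
    print(name, n)
print("distinct names:", len(counts), "total threads:", sum(counts.values()))
```

On a real system, the output of `sudo -u hive jstack 11089` could be saved to a file and its contents passed to the function. If the observation above holds, the result would show 20 distinct names with roughly 100 threads each.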

Environment

AWS EMR 5.6

Spark RDD checkpoint on S3 exits with exception intermittently

Issue

- Run a Spark job that saves RDD checkpoints to S3.
- The Spark job fails intermittently with the error below:

org.apache.spark.SparkException: Checkpoint RDD has a different number of partitions from original RDD. Original RDD [ID: xxx, num of partitions: 6]; Checkpoint RDD [ID: xxx, num of partitions: 5].
