AWS EMR: Hue cannot connect to Hive with custom authentication

I am having a problem when creating an EMR cluster on AWS. I have a CloudFormation (CF) script, invoked by Lambda, that creates the cluster; it includes steps to set up authentication for Hue, Hive, and other applications. For this project I wrote a custom authentication class for Hive and configured it through the CF script like so:

Configurations:
  - Classification: hive-site
    ConfigurationProperties:
      hive.metastore.client.factory.class: 'com.amazonaws.glue.catalog.metastore.ClientFactory'
      hive.server2.authentication: 'CUSTOM'
      hive.server2.custom.authentication.class: 'com.common.code.auth.CustomAuthClass'

When I log into Hue on the cluster, none of my tables from Hive are loaded and I get the following error:

TSocket read 0 bytes (code THRIFTTRANSPORT):
TTransportException('TSocket read 0 bytes',)

I confirmed that the Hive server was running, so that is not the issue. I also tried spinning up a cluster without the Hive authentication setup, and all my Hive tables were populated, so the problem must be related to the Hive custom authentication. I have played around with some of the hue.ini settings, without success. Does anyone have any advice, or know whether this is simply not possible?

If you use custom authentication, you will probably need to tweak the Thrift client in Hue so that it supports that authentication. The relevant code is here: https://github.com/cloudera/hue/blob/master/apps/beeswax/src/beeswax/server/hive_server2_lib.py#L567
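
To make "support this auth" a bit more concrete: as far as I understand, HiveServer2 handles CUSTOM the same way as LDAP and PAM on the wire, i.e. a SASL PLAIN handshake whose username and password are handed to your custom class. So the client has to open a SASL PLAIN transport and send credentials that your class accepts. Here is a rough sketch of that mapping. It is not Hue's code; the function name and the returned dict are made up for illustration, so check the linked file for what Hue actually does.

```python
# Illustrative only, NOT taken from Hue. Maps a hive.server2.authentication
# value to what a Thrift client has to do on its side of the connection.
# client_transport_settings and its return shape are hypothetical.

def client_transport_settings(hive_auth):
    """Return the transport settings a client needs for the given
    hive.server2.authentication value."""
    auth = hive_auth.strip().upper()
    if auth == "NOSASL":
        # Raw Thrift transport, no SASL handshake at all.
        return {"use_sasl": False, "mechanism": None, "needs_password": False}
    if auth == "KERBEROS":
        # SASL with GSSAPI; identity comes from the Kerberos ticket.
        return {"use_sasl": True, "mechanism": "GSSAPI", "needs_password": False}
    if auth == "NONE":
        # Still a SASL PLAIN handshake, but the server accepts any password.
        return {"use_sasl": True, "mechanism": "PLAIN", "needs_password": False}
    if auth in ("LDAP", "PAM", "CUSTOM"):
        # SASL PLAIN; the server validates the username/password,
        # for CUSTOM by calling the configured authentication class.
        return {"use_sasl": True, "mechanism": "PLAIN", "needs_password": True}
    raise ValueError("unknown hive.server2.authentication value: %r" % hive_auth)


if __name__ == "__main__":
    print(client_transport_settings("CUSTOM"))
```

If the Hue client ends up on a different branch than the server expects for CUSTOM, or sends credentials your class rejects, the server will likely just drop the connection, which would be consistent with the "TSocket read 0 bytes" error.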
