BlueData EPIC
EPIC ENTERPRISE GA 2.4
EPIC LITE GA 2.4
RELEASE NOTES
Copyright © 2016, BlueData Software, Inc. ALL RIGHTS RESERVED.
Notice
BlueData Software, Inc. believes that the information in this publication is accurate as of its publication date. However, the information is
subject to change without notice. THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” BLUEDATA SOFTWARE, INC. MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, or distribution of any BlueData software described in
this publication requires an applicable software license.
For the most up-to-date regulatory document for your product line,
please refer to your specific agreements or contact BlueData Tech-
nical Support at [email protected].
The information in this document is subject to change. This manual is
believed to be complete and accurate at the time of publication and
no responsibility is assumed for any errors that may appear. In no
event shall BlueData Software, Inc. be liable for incidental or consequential damages in connection with or arising from the use of this
manual and its accompanying related materials.
Copyrights and Trademarks
Published September, 2016. Printed in the United States of America.
Copyright 2016 by BlueData Software, Inc. All rights reserved. This
book or parts thereof may not be reproduced in any form without the
written permission of the publishers.
EPIC, EPIC Lite, and BlueData are trademarks of BlueData Software,
Inc. All other trademarks are the property of their respective owners.
Contact Information
BlueData Software, Inc.
3979 Freedom Circle, Suite 850
Santa Clara, California 95054
Email: [email protected]
Website: www.bluedata.com
Release Notes
This manual describes the known issues and workarounds in the
following builds:
• EPIC 2.4 GA
• EPIC Lite 2.4 GA
CAUTION: THESE RELEASE NOTES ARE NOT VALID FOR
ANY VERSION OF EPIC OR EPIC LITE OTHER THAN THE
VERSION(S) LISTED HERE.
BlueData EPIC and EPIC Lite
1.1 - New Features
The listed builds of EPIC and EPIC Lite include the new features
described in this section.
1.1.1 - Setup
DVR Networking
It is no longer necessary to create static routes when installing EPIC.
Further, any static routes created when installing previous versions
of EPIC must be removed.
Installation now supports restricted sudo privileges
The user installing EPIC no longer needs to have root or full sudo
access. See the About EPIC Guide for a list of restricted sudo
permissions required for installing EPIC.
Consolidated container storage pools
The /root and /data storage is now drawn from the same pool
in the container.
HDFS 2.7.2 support
EPIC 2.4 includes support for HDFS 2.7.2.
1.1.2 - Site Administration
Optional per-tenant KDC specification
An external KDC service may be specified for each tenant. This KDC
can be used by specific App Store entries to provide Kerberos
protection for a cluster's services.
Kerberos support added to clusters
Clusters can now be Kerberized using either Active Directory or MIT
KDC.
Expanded per-node container limit
Each node can now support up to 100 containers (from 30).
Enhanced LDAP/AD group membership features
LDAP/AD group functionality has been enhanced with parameter
validation. See Appendix A of the User/Administrator Guide.
Automatic refresh of LDAP/AD account settings on virtual nodes
Changes to either the tenant authentication groups or the User
Authentication settings in EPIC will be automatically pushed to all
affected clusters that employ LDAP/AD user accounts.
1.1.3 - Operation
Cnode memory leak fixes
Cnode memory leaks have been fixed.
External KDC support for App Store entries
Some App Store entries now support the ability to use the external
KDC defined for a tenant to add Kerberos protection to a cluster's
services within that tenant.
Tenant Member login restrictions for clusters with Edge nodes
If a cluster employing LDAP/AD user access is deployed with Edge
nodes, then user accounts will still be present on all cluster nodes;
however, Tenant Member user access will be restricted to the Edge
nodes.
1.2 - Resolved Issues
The following issues have been resolved in the listed builds of EPIC
and EPIC Lite.
1.2.1 - Site Administration
Can't move external users between tenants (HAATHI-11821)
Removing an external user from all tenants that user was part of
(meaning that the affected user is not part of any tenants) deletes
this external user from the EPIC user database. This behavior is
consistent with internal users as well. However, the EPIC UI does
not refresh the Assign Users screen, meaning that this user might
still be visible. Trying to add this user will fail with a 403: Forbidden
error because that user has already been deleted.
Workaround: Manually refresh the Assign Users screen before
reassigning the user to a different tenant. The user will disappear if
they are not part of any tenants and will need to be re-added before
assigning them to another tenant.
1.2.2 - Operation (Kafka)
Kafka services may be down after several reboots (HAATHI-11841)
Workaround: Restart services manually by logging in via SSH to the container where the service is down (as shown in the Service(s) Status tab of the <Cluster> screen for the affected cluster), and then executing the command sudo service kafka_broker start.
Kafka services may remain in a 'critical' state for a few minutes
after creating a cluster (HAATHI-11832)
Workaround: Wait approximately five (5) minutes for the statuses to
become green.
1.3 - Known Issues & Workarounds
This section lists the known issues that are present in the listed
versions of EPIC and EPIC Lite and methods to work around/recover
from these issues.
Clusters created prior to EPIC 2.3 may need to be restarted
(HAATHI-11926)
Clusters created using EPIC versions prior to 2.3 may fail when
performing an operation such as expansion or job launch. If this
happens, all of the services will appear red on the Services Status
tab of the <Cluster> screen for the affected cluster(s).
Workaround: Restart the affected cluster(s) after completing the
upgrade to EPIC 2.4.
Manual Hadoop commands from a non-root shell on the
controller host must use sudo (DOC-13)
Since EPIC now supports installation as a non-root user, Site
Administrators may more commonly need to execute Hadoop
commands from a non-root shell on the EPIC controller host. You
must use sudo to run such commands successfully as non-root.
(This is true regardless of whether EPIC was installed as root.)
Workaround: As an example, running sudo -nE -u hdfs hadoop fs -ls / as a user that has sudo privileges will list the root directory of the local HDFS.
Executing “service network restart” on a host makes its nodes
inaccessible (HAATHI-11001)
A root user running “service network restart” in the host OS shell of
any EPIC host will destroy necessary components of the virtual
network used by virtual cluster nodes. Any virtual node assigned to
that host will then become inaccessible through the network.
Recovery: On the EPIC Controller host, execute these commands to
restart the management service and rebuild the virtual network
(ideally when the system is in Lockdown mode with no management
operations in progress):
/sbin/stop bds-controller
/sbin/start bds-controller
After platform HA failover under load, HA Status is amber (DOC-10)
If the HA Status service indicator is showing an amber (warning)
color, platform HA may not be able to protect against further failures.
A known cause of this status is when the Pacemaker service fails
during platform HA failovers under high load. This condition can be
identified by executing the following command in a root shell on the
current active EPIC Controller host:
pcs status
If the output of this command shows any Failed actions, then a
Pacemaker failure is the cause of the platform HA warning status.
Recovery: In the root shell of the Controller host, execute the
following command to restore Pacemaker:
crm_resource -P
1.3.1 - Operation (General)
Hive jobs that use DataTap paths may fail with a
SemanticException error (HAATHI-10733)
When Hive creates a table, the location where the table metadata is
stored comes from the Hive configuration parameter fs.defaultFS by
default (which will point to the cluster filesystem). If a Hive job
references DataTap paths outside of the filesystem where the table
metadata is stored, then the job will fail with a SemanticException
error, because Hive enforces that all data sources come from the
same filesystem.
Workaround: Explicitly set the table metadata location to a path on
the same DataTap that you will use for the job inputs and/or outputs,
using the LOCATION clause when creating the table. For example, if
you intend to use the TenantStorage DataTap, you would set the
table metadata location to some path on that DataTap such as:
CREATE TABLE docs (c1 INT, c2 STRING) LOCATION 'dtap://TenantStorage/hive-table-docs'
Some URLs in a cluster use hostnames instead of IP addresses
(HAATHI-10155)
Each virtual node created by EPIC is assigned a host name. These
host names are only known to other virtual nodes within the EPIC
platform. Client computers outside the EPIC platform may wish to
follow web links to services within the virtual nodes, but those clients
cannot resolve links that are based on the virtual node host names
instead of IP addresses.
Workaround: Users must configure the hosts file on the client
computer that will be accessing the virtual nodes in a given cluster.
To update the /etc/hosts file on a Linux system:
1. After creating a cluster, navigate to the Cluster Management
screen.
For each cluster, a purple Hosts File Info icon on this screen provides the template for setting up the /etc/hosts file.
The template file should look like this:
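As an illustration only (the IP addresses and host names below are hypothetical; use the actual values from the Hosts File Info template for your cluster), the entries map each virtual node's IP address to its EPIC-assigned host names:

```
10.32.1.10   bluedata-101.mycompany.com   bluedata-101
10.32.1.11   bluedata-102.mycompany.com   bluedata-102
```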
2. Copy the data from this template to your /etc/hosts file. After a
cluster is deleted, delete these entries from the /etc/hosts file as
well.
To update the /etc/hosts file on a Windows 8/8.1/10 system:
1. Click Start and then type Notepad in the Search box.
2. Right-click the Notepad icon that appears and then select Run
as Administrator.
3. Follow the Windows 7 procedure below, starting at Step 3.
To update the /etc/hosts file on a Windows 7 system:
1. Click Start>All Programs>Accessories.
2. Right click Notepad and select Run as administrator.
3. If applicable, click Continue in the Windows needs your
permission UAC window.
4. In Notepad, click File>Open.
5. In the File Name field, type
C:\Windows\System32\Drivers\etc\hosts, and then
click Open.
To update the /etc/hosts file on a Windows NT/2000/XP system:
1. Click Start>All Programs>Accessories>Notepad.
2. In Notepad, click File>Open.
3. In the File Name field, type
C:\Windows\System32\Drivers\etc\hosts, and then
click Open.
The C:\Windows\System32\Drivers\etc\hosts file is a hidden
file, so you may need to enable the folder option that shows hidden
files and folders before you can open it.
1.3.2 - Operation (HDP)
Hive on Tez queries fail due to a missing configuration
(HAATHI-11804)
This issue occurs because the file bluedata-dtap.jar is not
in the additional classpath for Tez.
Workaround:
1. On the Ambari Dashboard screen, select Tez.
2. On right hand side, in the Filter textbox, search for
tez.cluster.additional.classpath.prefix.
3. Edit the value of this property to add the location of bluedata-dtap.jar:
/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure:/opt/bluedata/bluedata-dtap.jar
4. Restart Tez, Hive, and Oozie.
1.3.3 - Operation (CDH with
Cloudera Manager)
Cloudera Manager reports incorrect values for a node's
resources (DOC-9)
Cloudera Manager accesses the Linux /proc filesystem to determine
the characteristics of the node(s) it is managing. Because container
technology is used to implement virtual nodes, this filesystem reports
information about the host rather than about the individual node,
causing Cloudera Manager to report inflated values for a node's CPU
count, memory, and disk.
Workaround: Use the EPIC web interface to see a node's virtual
hardware configuration (flavor).
Hue wizard on CDH warns that no Job Server is running for
Spark (HAATHI-10737)
For Cloudera virtual clusters, the Hue Quick Start Wizard shows a
Spark Editor warning that The app won't work without a running
Job Server.
Workaround: This message is expected. Spark jobs can still be run
on Cloudera virtual clusters if Spark support was selected at cluster
creation time, as documented in the Running Applications Guide.
Verify that Cloudera Manager is up and running after an upgrade
and before attempting to expand a cluster (HAATHI-10898)
Cloudera Manager must be up and running after upgrading to EPIC
2.4 before attempting to expand a cluster.
Workaround: Verify that Cloudera Manager is running on all affected
cluster(s) before attempting to expand the cluster(s). If needed,
restart the cluster(s).
1.3.4 - Operation (HDP with Ambari)
HDP/Ambari cluster services must be manually restarted after
reboot.
This is an Ambari issue, not an EPIC issue.
Ambari dashboard does not show YARN or Flume service
metrics (HAATHI-10667)
The Ambari YARN and Flume summary dashboards of an HDP 2.2 +
Ambari cluster will display No data for service metrics.
Workaround: Some metrics are shown on the top-level Ambari
dashboard. Other metrics can be seen on the cluster's Ganglia
dashboard: in the <Cluster> screen of the EPIC interface, select the
Charts tab, and then click the Ganglia Dashboard link at the bottom
of the page. However, there is currently no complete workaround
for this issue.
1.3.5 - Operation (Spark)
Zeppelin tutorial example fails with file not found exception
(HAATHI-11562)
The first attempt to run the Zeppelin tutorial example may fail with a
message claiming that the file /data/bank-full.csv does not exist. The
file does exist; this behavior appears to be a Zeppelin issue.
Workaround: Run the example again, with different query values.
Usually this will get past the error. If not, copy the CSV file to a
location on a DataTap, change the example to reference that path,
and try again.
Thrift server logs report “No data or no sasl data in the stream”
(HAATHI-11563)
You may observe the “No data or no sasl data in the stream”
message in the thrift server logs in the $SPARK_HOME/logs/
directory; however this does not indicate an issue with the thrift
server functionality. You may ignore this message.
Workaround: None required.
Spark applications will wait indefinitely if no free vCPUs (DOC-19)
This is a general Spark behavior, but it is worth some emphasis in an
environment where various virtual hardware resources (possibly in
small amounts) can be quickly provisioned for use with Spark: a
Spark application will be stuck in the Waiting state if all vCPUs in the
cluster are already considered to be in-use (by the Spark framework
and other running Spark applications).
Workaround: You can always get more vCPUs by increasing the
cluster Worker count. If this is not desirable, however, you will need
to control the consumption of vCPUs by Spark applications. Two
examples of such control:
• You can restrict the number of vCPUs that any future Spark
application can consume by assigning a value to the
spark.cores.max property in $SPARK_HOME/conf/spark-defaults.conf
and restarting the Spark master (sudo service
spark-master restart).
• By default in Spark 1.5, the thrift server is configured to use 2
vCPUs on the Spark master node. You can reduce this to 1 vCPU
by editing the total-executor-cores argument value in the /etc/
init.d/hive-thriftserver script, and then restarting the thrift server
(sudo service hive-thriftserver restart).
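As a sketch of the first option, a hypothetical spark-defaults.conf entry might look like the following (the value 4 is an example only; choose a limit that suits your cluster):

```
# $SPARK_HOME/conf/spark-defaults.conf
# Cap the total cores any one Spark application may consume (example value)
spark.cores.max    4
```

After saving the change, restart the Spark master as described above; the new limit applies to subsequently launched applications.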
1.3.6 - Operation (MapR)
Running Jobs
TestDFSIO does not support DataTaps as the output filesystem.
Running a job against a DataTap will fail with an ETERM:
Permission denied error.
Workaround: Run the job again. Subsequent job runs will complete
successfully. Jobs must be run using the command line; running
them from the EPIC UI is not supported.
Drill may be down
Enabled security features may cause Drill to be down because no
user has a ticket.
Workaround: If Drill is down, execute the following command from
the command line:
maprcli node services -name drill-bits -action restart -nodes
Job submission is not supported through the EPIC UI.
Workaround: Run all jobs through the Command Line Interface (CLI).
MapR Apps and Features Supported by EPIC
• MaprFS
• MapRDB
• MapR Tickets for security
• MapR Streams
• MapR Control System
• YARN HA
• Apache Drill
• Pig
• Hive
• Oozie
• Zookeeper
• Cluster Expansion
MapR Apps and Features Not Currently Supported by EPIC
• HBase
• Hue
• HttpFS
• Kerberos
• CLDB HA
• Cluster Shrinking
• Sqoop
• Flume
• Spark
EPIC and EPIC Lite Release Notes 2.4 (09/2016)