Pivotal Greenplum Database

PRODUCT DOCUMENTATION Pivotal Greenplum Database ® Version 5.0.0 Pivotal Greenplum Database Documentation Rev: A08 © 2017 Pivotal Software, Inc.

  • PRODUCT DOCUMENTATION

    Pivotal™ Greenplum Database® Version 5.0.0

    Pivotal Greenplum Database Documentation Rev: A08

    © 2017 Pivotal Software, Inc.

  • Copyright Release Notes

    Notice

    Copyright

    Privacy Policy | Terms of Use

    Copyright © 2017 Pivotal Software, Inc. All rights reserved.

    Pivotal Software, Inc. believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." PIVOTAL SOFTWARE, INC. ("Pivotal") MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

    Use, copying, and distribution of any Pivotal software described in this publication requires an applicable software license.

    All trademarks used herein are the property of Pivotal or their respective owners.

    Revised September 2017 (5.0.0)

    http://pivotal.io/privacy-policy
    http://pivotal.io/terms-of-use

  • Contents Release Notes

    Contents

    Chapter 2: Pivotal Greenplum 5.0.0 Release Notes ..... 14
      Welcome to Pivotal Greenplum 5.0.0 ..... 15
      New Features ..... 16
        PostgreSQL Core Features ..... 16
        Python 2.7 ..... 19
        gpdbrestore Support for CASTs ..... 19
        Enhanced Session State Monitoring ..... 19
        Python Data Science Module Package ..... 19
        R Data Science Library Package ..... 20
        COPY Command ON SEGMENT Clause ..... 20
      Experimental Features ..... 21
      Changed Features ..... 22
        GPORCA as Default Optimizer ..... 22
        Legacy Optimizer Changes ..... 22
        Escape Characters in String Literals ..... 22
        Implicit Casting of Text ..... 23
        Logging for Formatting Errors ..... 23
        ANALYZE Command ..... 23
        Query Dispatcher ..... 24
        Datatype Storage ..... 24
        S3 and Custom Protocols ..... 24
        System Catalog ..... 24
        pg_dumpall Utility ..... 24
        psql Utility ..... 25
        gpstart Utility ..... 25
        PL/Python Multi-Dimensional Arrays ..... 25
        Updatable Cursors ..... 25
        Creating Trusted Languages ..... 25
        pgcrypto Removes FIPS Support ..... 25
        Locking for ALTER TABLE RENAME ..... 25
        Free Tuple Handling for Persistent Tables ..... 25
        Transaction ID Assignment ..... 26
        Default Master Data Directory for Utilities ..... 26
        Independent relfilenode and OID ..... 26
        Partner Connector ..... 26
        GPORCA Supports Indexes on Leaf Child Partitions ..... 26
        Requirement for ZLIB Compression ..... 26
      Removed and Deprecated Features ..... 27
      Differences Compared to Open Source Greenplum Database ..... 28
      Supported Platforms ..... 29
        Veritas NetBackup ..... 29
        Supported Platform Notes ..... 30
      Pivotal Greenplum Tools and Extensions Compatibility ..... 31
        Client Tools ..... 31
        Extensions ..... 31
      Pivotal GPText Compatibility ..... 33
      Hadoop Distribution Compatibility ..... 34
      Migrating Data to Pivotal Greenplum 5.0.0 ..... 35
      Resolved Issues ..... 36
      Known Issues and Limitations ..... 37

    Chapter 4: Greenplum Database Installation Guide ..... 42
      Introduction to Greenplum ..... 43
        The Greenplum Master ..... 43
        The Segments ..... 44
        The Interconnect ..... 45
        ETL Hosts for Data Loading ..... 46
        Greenplum Performance Monitor ..... 47
      Estimating Storage Capacity ..... 48
        Calculating Usable Disk Capacity ..... 48
        Calculating User Data Size ..... 48
        Calculating Space Requirements for Metadata and Logs ..... 49
      Configuring Your Systems and Installing Greenplum ..... 50
        System Requirements ..... 50
        Setting the Greenplum Recommended OS Parameters ..... 52
        Installing the Greenplum Database Software ..... 56
        Installing and Configuring Greenplum on all Hosts ..... 57
        Installing Oracle Compatibility Functions ..... 59
        Installing Optional Modules ..... 59
        Installing Greenplum Database Extensions ..... 60
        Creating the Data Storage Areas ..... 60
        Synchronizing System Clocks ..... 61
        Enabling iptables ..... 62
        Amazon EC2 Configuration (Amazon Web Services) ..... 65
        Next Steps ..... 70
      Installing the Data Science Packages ..... 71
        Python Data Science Module Package ..... 71
        R Data Science Library Package ..... 73
      Validating Your Systems ..... 76
        Validating OS Settings ..... 76
        Validating Hardware Performance ..... 76
        Validating Disk I/O and Memory Bandwidth ..... 77
      Configuring Localization Settings ..... 79
        About Locale Support in Greenplum Database ..... 79
        Character Set Support ..... 80
        Setting the Character Set ..... 83
        Character Set Conversion Between Server and Client ..... 83
      Initializing a Greenplum Database System ..... 86
        Overview ..... 86
        Initializing Greenplum Database ..... 86
        Setting Greenplum Environment Variables ..... 89
        Next Steps ..... 90
      Installation Management Utilities ..... 91
      Greenplum Environment Variables ..... 92
        Required Environment Variables ..... 92
        Optional Environment Variables ..... 92

    Chapter 6: Greenplum Database Administrator Guide ..... 94
      Greenplum Database Concepts ..... 95
        About the Greenplum Architecture ..... 95
        About Management and Monitoring Utilities ..... 97
        About Concurrency Control in Greenplum Database ..... 98
        About Parallel Data Loading ..... 106
        About Redundancy and Failover in Greenplum Database ..... 107
        About Database Statistics in Greenplum Database ..... 109
      Managing a Greenplum System ..... 116
        Starting and Stopping Greenplum Database ..... 116
        Accessing the Database ..... 118
        Configuring the Greenplum Database System ..... 130
        Enabling High Availability and Data Consistency Features ..... 140
        Backing Up and Restoring Databases ..... 159
        Expanding a Greenplum System ..... 181
        Migrating Data with gptransfer ..... 196
        Monitoring a Greenplum System ..... 202
        Routine System Maintenance Tasks ..... 222
        Recommended Monitoring and Maintenance Tasks ..... 226
      Managing Greenplum Database Access ..... 234
        Configuring Client Authentication ..... 234
        Managing Roles and Privileges ..... 260
      Defining Database Objects ..... 267
        Creating and Managing Databases ..... 267
        Creating and Managing Tablespaces ..... 269
        Creating and Managing Schemas ..... 272
        Creating and Managing Tables ..... 273
        Choosing the Table Storage Model ..... 276
        Partitioning Large Tables ..... 287
        Creating and Using Sequences ..... 299
        Using Indexes in Greenplum Database ..... 301
        Creating and Managing Views ..... 304
      Inserting, Updating, and Deleting Data ..... 306
        About Concurrency Control in Greenplum Database ..... 306
        Inserting Rows ..... 307
        Updating Existing Rows ..... 308
        Deleting Rows ..... 308
        Working With Transactions ..... 308
        Vacuuming the Database ..... 310
      Querying Data ..... 311
        About Greenplum Query Processing ..... 311
        About GPORCA ..... 314
        Defining Queries ..... 326
        WITH Queries (Common Table Expressions) ..... 335
        Using Functions and Operators ..... 338
        Working with JSON Data ..... 347
        Working with XML Data ..... 351
        Query Performance ..... 363
        Managing Spill Files Generated by Queries ..... 363
        Query Profiling ..... 363
      Working with External Data ..... 369
        Defining External Tables ..... 369
        Accessing HDFS Data with gphdfs ..... 386
        Using the Greenplum Parallel File Server (gpfdist) ..... 409
      Loading and Unloading Data ..... 413
        Loading Data Using an External Table ..... 414
        Loading and Writing Non-HDFS Custom Data ..... 414
        Handling Load Errors ..... 417
        Loading Data with gpload ..... 419
        Loading Data with COPY ..... 420
        Running COPY in Single Row Error Isolation Mode ..... 420
        Optimizing Data Load and Query Performance ..... 421
        Unloading Data from Greenplum Database ..... 421
        Transforming XML Data ..... 424
        Formatting Data Files ..... 433
        Example Custom Data Access Protocol ..... 436
      Managing Performance ..... 443
        Defining Database Performance ..... 443
        Common Causes of Performance Issues ..... 444
        Greenplum Database Memory Overview ..... 447
        Managing Resources ..... 450
        Investigating a Performance Problem ..... 471

    Chapter 8: Greenplum Database Security Configuration Guide ..... 474
      Securing the Database ..... 475
      Greenplum Database Ports and Protocols ..... 476
      Configuring Client Authentication ..... 480
        Allowing Connections to Greenplum Database ..... 480
        Editing the pg_hba.conf File ..... 481
        Authentication Methods ..... 482
        SSL Client Authentication ..... 484
        PAM Based Authentication ..... 486
        Radius Authentication ..... 487
        Limiting Concurrent Connections ..... 487
        Encrypting Client/Server Connections ..... 488
      Configuring Database Authorization ..... 489
        Access Permissions and Roles ..... 489
        Managing Object Privileges ..... 489
        Using SHA-256 Encryption ..... 490
        Restricting Access by Time ..... 492
        Dropping a Time-based Restriction ..... 494
      Greenplum Command Center Security ..... 495
      Auditing ..... 498
      Encrypting Data and Database Connections ..... 503
        Encrypting gpfdist Connections ..... 503
        Encrypting Data at Rest with pgcrypto ..... 504
      Enabling gphdfs Authentication with a Kerberos-secured Hadoop Cluster ..... 512
        Prerequisites ..... 512
        Configuring the Greenplum Cluster ..... 512
        Creating and Installing Keytab Files ..... 513
        Configuring gphdfs for Kerberos ..... 514
        Testing Greenplum Database Access to HDFS ..... 515
        Troubleshooting HDFS with Kerberos ..... 516
      Security Best Practices ..... 518

    Chapter 10: Greenplum Database Best Practices ..... 522
      Best Practices Summary ..... 523
      System Configuration ..... 529
      Schema Design ..... 533
        Data Types ..... 533
        Storage Model ..... 533
        Compression ..... 534
        Distributions ..... 535
        Partitioning ..... 538
        Indexes ..... 540
        Column Sequence and Byte Alignment ..... 540
      Memory and Resource Management ..... 542
      System Monitoring and Maintenance ..... 546
        Monitoring ..... 546
        Updating Statistics with ANALYZE ..... 547
        Managing Bloat in the Database ..... 548
        Monitoring Greenplum Database Log Files ..... 552
      Loading Data ..... 554
        INSERT Statement with Column Values ..... 554
        COPY Statement ..... 554
        External Tables ..... 554
        External Tables with Gpfdist ..... 554
        Gpload ..... 555
        Best Practices ..... 556
      Migrating Data with Gptransfer ..... 557
      Security ..... 562
      Encrypting Data and Database Connections ..... 565
      Accessing a Kerberized Hadoop Cluster ..... 574
        Prerequisites ..... 574
        Configuring the Greenplum Cluster ..... 574
        Creating and Installing Keytab Files ..... 575
        Configuring gphdfs for Kerberos ..... 576
        Testing Greenplum Database Access to HDFS ..... 577
        Troubleshooting HDFS with Kerberos ..... 578
      Tuning SQL Queries ..... 580
        How to Generate Explain Plans ..... 580
        How to Read Explain Plans ..... 580
        Optimizing Greenplum Queries ..... 582
      High Availability ..... 584
        Disk Storage ..... 584
        Master Mirroring ..... 584
        Segment Mirroring ..... 585
        Dual Clusters ..... 586
        Backup and Restore ..... 586
        Detecting Failed Master and Segment Instances ..... 587
        Segment Mirroring Configuration ..... 588

    Chapter 12: Greenplum Database Utility Guide ..... 594
    Management Utility Reference ..... 595

    Backend Server Programs ..... 596
    analyzedb ..... 597
    gpactivatestandby ..... 601
    gpaddmirrors ..... 602
    gpcheck ..... 606
    gpcheckcat ..... 608
    gpcheckperf ..... 610
    gpconfig ..... 613
    gpcrondump ..... 617
    gpdbrestore ..... 630
    gpdeletesystem ..... 637
    gpexpand ..... 638
    gpfdist ..... 641
    gpfilespace ..... 644
    gpinitstandby ..... 647
    gpinitsystem ..... 649
    gpload ..... 656

    gplogfilter ..... 666
    gpmapreduce ..... 668
    gpmfr ..... 670
    gpperfmon_install ..... 673
    gppkg ..... 678
    gprecoverseg ..... 679
    gpreload ..... 684
    gpscp ..... 686
    gpseginstall ..... 687
    gpssh ..... 689
    gpssh-exkeys ..... 692
    gpstart ..... 694
    gpstate ..... 696
    gpstop ..... 700
    gpsys1 ..... 702
    gptransfer ..... 703
    pgbouncer ..... 714

    Client Utility Reference ..... 735
    Client Utility Summary ..... 735

    Oracle Compatibility Functions ..... 785
    Installing Oracle Compatibility Functions ..... 785
    Oracle and Greenplum Implementation Differences ..... 785
    Oracle Compatibility Functions Reference ..... 786
    add_months ..... 787
    bitand ..... 787
    concat ..... 788
    cosh ..... 789
    decode ..... 789
    dump ..... 792
    instr ..... 792
    last_day ..... 793
    listagg ..... 794
    listagg (2) ..... 794
    lnnvl ..... 795
    months_between ..... 796
    nanvl ..... 796
    next_day ..... 797
    next_day (2) ..... 798
    nlssort ..... 799
    nvl ..... 799
    nvl2 ..... 800
    oracle.substr ..... 801
    reverse ..... 802
    round ..... 802
    sinh ..... 803
    tanh ..... 804
    trunc ..... 805

    dblink Functions ..... 807
    hstore Functions ..... 809

    Chapter 14: Greenplum Database Reference Guide ..... 812
    SQL Command Reference ..... 813

    SQL Syntax Summary ..... 815
    ABORT ..... 842
    ALTER AGGREGATE ..... 843

    ALTER CONVERSION ..... 844
    ALTER DATABASE ..... 845
    ALTER DOMAIN ..... 846
    ALTER EXTENSION ..... 848
    ALTER EXTERNAL TABLE ..... 850
    ALTER FILESPACE ..... 852
    ALTER FUNCTION ..... 853
    ALTER GROUP ..... 855
    ALTER INDEX ..... 856
    ALTER LANGUAGE ..... 857
    ALTER OPERATOR ..... 858
    ALTER OPERATOR CLASS ..... 859
    ALTER OPERATOR FAMILY ..... 859
    ALTER PROTOCOL ..... 862
    ALTER RESOURCE GROUP ..... 863
    ALTER RESOURCE QUEUE ..... 865
    ALTER ROLE ..... 867
    ALTER SCHEMA ..... 871
    ALTER SEQUENCE ..... 871
    ALTER TABLE ..... 873
    ALTER TABLESPACE ..... 883
    ALTER TYPE ..... 884
    ALTER USER ..... 885
    ALTER VIEW ..... 886
    ANALYZE ..... 887
    BEGIN ..... 890
    CHECKPOINT ..... 891
    CLOSE ..... 891
    CLUSTER ..... 892
    COMMENT ..... 894
    COMMIT ..... 896
    COPY ..... 896
    CREATE AGGREGATE ..... 906
    CREATE CAST ..... 910
    CREATE CONVERSION ..... 912
    CREATE DATABASE ..... 914
    CREATE DOMAIN ..... 915
    CREATE EXTENSION ..... 916
    CREATE EXTERNAL TABLE ..... 918
    CREATE FUNCTION ..... 927
    CREATE GROUP ..... 933
    CREATE INDEX ..... 933
    CREATE LANGUAGE ..... 936
    CREATE OPERATOR ..... 939
    CREATE OPERATOR CLASS ..... 942
    CREATE OPERATOR FAMILY ..... 946
    CREATE PROTOCOL ..... 947
    CREATE RESOURCE GROUP ..... 948
    CREATE RESOURCE QUEUE ..... 950
    CREATE ROLE ..... 953
    CREATE RULE ..... 958
    CREATE SCHEMA ..... 959
    CREATE SEQUENCE ..... 961
    CREATE TABLE ..... 963
    CREATE TABLE AS ..... 975
    CREATE TABLESPACE ..... 978

    CREATE TYPE ..... 979
    CREATE USER ..... 984
    CREATE VIEW ..... 985
    DEALLOCATE ..... 987
    DECLARE ..... 987
    DELETE ..... 990
    DISCARD ..... 992
    DO ..... 993
    DROP AGGREGATE ..... 994
    DROP CAST ..... 995
    DROP CONVERSION ..... 996
    DROP DATABASE ..... 997
    DROP DOMAIN ..... 997
    DROP EXTENSION ..... 998
    DROP EXTERNAL TABLE ..... 999
    DROP FILESPACE ..... 1000
    DROP FUNCTION ..... 1000
    DROP GROUP ..... 1001
    DROP INDEX ..... 1002
    DROP LANGUAGE ..... 1002
    DROP OPERATOR ..... 1003
    DROP OPERATOR CLASS ..... 1004
    DROP OPERATOR FAMILY ..... 1005
    DROP OWNED ..... 1006
    DROP PROTOCOL ..... 1007
    DROP RESOURCE GROUP ..... 1007
    DROP RESOURCE QUEUE ..... 1008
    DROP ROLE ..... 1009
    DROP RULE ..... 1010
    DROP SCHEMA ..... 1011
    DROP SEQUENCE ..... 1012
    DROP TABLE ..... 1012
    DROP TABLESPACE ..... 1013
    DROP TYPE ..... 1014
    DROP USER ..... 1015
    DROP VIEW ..... 1015
    END ..... 1016
    EXECUTE ..... 1016
    EXPLAIN ..... 1017
    FETCH ..... 1020
    GRANT ..... 1022
    INSERT ..... 1026
    LOAD ..... 1028
    LOCK ..... 1029
    MOVE ..... 1032
    PREPARE ..... 1033
    REASSIGN OWNED ..... 1035
    REINDEX ..... 1035
    RELEASE SAVEPOINT ..... 1037
    RESET ..... 1038
    REVOKE ..... 1038
    ROLLBACK ..... 1040
    ROLLBACK TO SAVEPOINT ..... 1041
    SAVEPOINT ..... 1042
    SELECT ..... 1043
    SELECT INTO ..... 1058

    SET ..... 1059
    SET ROLE ..... 1061
    SET SESSION AUTHORIZATION ..... 1062
    SET TRANSACTION ..... 1063
    SHOW ..... 1065
    START TRANSACTION ..... 1066
    TRUNCATE ..... 1067
    UPDATE ..... 1068
    VACUUM ..... 1071
    VALUES ..... 1073

    SQL 2008 Optional Feature Compliance ..... 1076
    Greenplum Environment Variables ..... 1105

    Required Environment Variables ..... 1105
    Optional Environment Variables ..... 1105

    System Catalog Reference ..... 1107
    System Tables ..... 1107
    System Views ..... 1108
    System Catalogs Definitions ..... 1109

    The gp_toolkit Administrative Schema ..... 1194
    Checking for Tables that Need Routine Maintenance ..... 1194
    Checking for Locks ..... 1195
    Checking Append-Optimized Tables ..... 1197
    Viewing Greenplum Database Server Log Files ..... 1201
    Checking Server Configuration Files ..... 1204
    Checking for Failed Segments ..... 1205
    Checking Resource Queue Activity and Status ..... 1206
    Checking Query Disk Spill Space Usage ..... 1208
    Viewing Users and Groups (Roles) ..... 1210
    Checking Database Object Sizes and Disk Space ..... 1211
    Checking for Uneven Data Distribution ..... 1215

    The gpperfmon Database ..... 1216
    database_* ..... 1218
    diskspace_* ..... 1219
    filerep_* ..... 1219
    interface_stats_* ..... 1224
    log_alert_* ..... 1225
    queries_* ..... 1227
    segment_* ..... 1229
    socket_stats_* ..... 1230
    system_* ..... 1231
    dynamic_memory_info ..... 1232
    memory_info ..... 1233

    Greenplum Database Data Types ..... 1235
    Character Set Support ..... 1239

    Setting the Character Set ..... 1241
    Character Set Conversion Between Server and Client ..... 1241

    Server Configuration Parameters ..... 1244
    Parameter Types and Values ..... 1244
    Setting Parameters ..... 1244
    Parameter Categories ..... 1245
    Configuration Parameters ..... 1255

    Summary of Built-in Functions ..... 1333
    Greenplum Database Function Types ..... 1333
    Built-in Functions and Operators ..... 1334
    JSON Functions and Operators ..... 1337
    Window Functions ..... 1340

    Advanced Aggregate Functions ..... 1342
    Greenplum MapReduce Specification ..... 1344

    Greenplum MapReduce Document Format........................................................................1344Greenplum MapReduce Document Schema......................................................................1345Example Greenplum MapReduce Document..................................................................... 1352

    Greenplum PL/pgSQL Procedural Language............................................................................... 1358About Greenplum Database PL/pgSQL............................................................................. 1358PL/pgSQL Plan Caching.....................................................................................................1360PL/pgSQL Examples...........................................................................................................1360References..........................................................................................................................1364

    Greenplum PostGIS Extension..................................................................................................... 1365About PostGIS.................................................................................................................... 1365Enabling and Removing PostGIS Support......................................................................... 1366Usage..................................................................................................................................1367PostGIS Extension Support and Limitations...................................................................... 1367PostGIS Support Scripts.....................................................................................................1369

    Greenplum PL/R Language Extension..........................................................................................1371About Greenplum Database PL/R......................................................................................1371

    Greenplum PL/Python Language Extension................................................................................. 1377About Greenplum PL/Python..............................................................................................1377Enabling and Removing PL/Python support...................................................................... 1377Developing Functions with PL/Python................................................................................1378Installing Python Modules...................................................................................................1381Examples............................................................................................................................ 1387References..........................................................................................................................1388

    Greenplum PL/Java Language Extension.....................................................................................1390About PL/Java.................................................................................................................... 1390About Greenplum Database PL/Java.................................................................................1391Installing PL/Java................................................................................................................1392Uninstalling PL/Java........................................................................................................... 1393Enabling PL/Java and Installing JAR Files........................................................................ 1394Writing PL/Java functions................................................................................................... 1394Using JDBC........................................................................................................................ 1400Exception Handling.............................................................................................................1400Savepoints.......................................................................................................................... 1400Logging............................................................................................................................... 1401Security............................................................................................................................... 1401Some PL/Java Issues and Solutions..................................................................................1402Example.............................................................................................................................. 1403References..........................................................................................................................1404

    Greenplum PL/Perl Language Extension......................................................................................1405About Greenplum PL/Perl...................................................................................................1405Greenplum Database PL/Perl Limitations.......................................................................... 1405Trusted/Untrusted Language.............................................................................................. 1405Enabling and Removing PL/Perl Support...........................................................................1406Developing Functions with PL/Perl.....................................................................................1406

    Greenplum MADlib Extension for Analytics.................................................................................. 1410About MADlib......................................................................................................................1410Installing MADlib................................................................................................................. 1410Upgrading MADlib...............................................................................................................1411Uninstalling MADlib.............................................................................................................1411Examples............................................................................................................................ 1412References..........................................................................................................................1418

    Greenplum Fuzzy String Match Extension....................................................................................1420Soundex Functions............................................................................................................. 1420Levenshtein Functions........................................................................................................1421

            Metaphone Functions
            Double Metaphone Functions
            Installing and Uninstalling the Fuzzy String Match Functions
        Summary of Greenplum Features
            Greenplum SQL Standard Conformance
            Greenplum and PostgreSQL Compatibility

    Chapter 16: Greenplum Database UNIX Client Documentation
        Greenplum Database Client Tools for UNIX
            Installing the Greenplum Client Tools
            Client Tools Reference
        Greenplum Database Load Tools for UNIX
            Installing the Greenplum Load Tools
            Load Tools Reference

    Chapter 17: Greenplum Database Windows Client Documentation
        Greenplum Database Client Tools for Windows
            Installing the Greenplum Client Tools
            Running the Greenplum Client Tools
            Client Tools Reference
        Greenplum Database Load Tools for Windows
            Installing Greenplum Loader
            Running Greenplum Loader
            Running gpfdist as a Windows Service
            Loader Program Reference

    Chapter 18: DataDirect ODBC Drivers for Pivotal Greenplum
        Prerequisites
        Supported Client Platforms
        Installing on Linux Systems
            Configuring the Driver on Linux
            Testing the Driver Connection on Linux
        Installing on Windows Systems
            Verifying the Version on Windows
            Configuring and Testing the Driver on Windows
        DataDirect Driver Documentation

    Chapter 19: DataDirect JDBC Driver for Pivotal Greenplum
        Prerequisites
        Downloading the DataDirect JDBC Driver
        Obtaining Version Details for the Driver
        Usage Information
        DataDirect Driver Documentation

  • Pivotal Greenplum 5.0.0 Release Notes Release Notes

    14

    Chapter 2

    Pivotal Greenplum 5.0.0 Release Notes

    Updated: September, 2017

    Welcome to Pivotal Greenplum 5.0.0

    Pivotal Greenplum is a massively parallel processing (MPP) database server that supports next generation data warehousing and large-scale analytics processing. By automatically partitioning data and running parallel queries, it allows a cluster of servers to operate as a single database supercomputer performing tens or hundreds of times faster than a traditional database. It supports SQL, MapReduce parallel processing, and data volumes ranging from hundreds of gigabytes to hundreds of terabytes.

    Pivotal Greenplum 5.0.0 is a major new release, and is the first Pivotal Greenplum release based on the open source Greenplum Database project code (http://greenplum.org/). Pivotal Greenplum 5.0.0 includes many new features and product changes as compared to prior releases.

    Pivotal Greenplum 5.0.0 software is available for download from Pivotal Network (https://network.pivotal.io/products).

    Important: Pivotal Greenplum 5.0.0 is not yet certified for running on DCA systems. Contact your DCA representative for information about the availability of Greenplum Database 5.0.0 support on the DCA.

    Important: Pivotal Support does not provide support for open source versions of Greenplum Database. Only Pivotal Greenplum is supported by Pivotal Support.

    New Features

    Pivotal Greenplum 5.0.0 is based on the open source Greenplum Database project code, and includes these new features.

    • PostgreSQL Core Features
    • Python 2.7
    • gpdbrestore Support for CASTs
    • Enhanced Session State Monitoring
    • Python Data Science Module Package
    • R Data Science Library Package
    • COPY Command ON SEGMENT Clause

    PostgreSQL Core Features

    Pivotal Greenplum 5.0.0 incorporates several new features from PostgreSQL 8.3 and later.

    • Heap Data Checksums
    • New Datatype Support
    • Improved XML Datatype Support
    • Anonymous Blocks
    • dblink Module
    • hstore Data Type and Functions
    • Cached Plan Invalidation
    • Ordering Results with NULL Values
    • Transaction ID Functions
    • Additional PostgreSQL Features

    Heap Data Checksums

    Pivotal Greenplum 5.0.0 now includes a checksum feature to detect corruption in the I/O system for heap storage. Checksums are computed when data pages are flushed to disk and verified when pages are re-read from storage. If checksum verification fails, the data page is not allowed to be read back into memory. Checksums are calculated for all heap pages, in all databases, including pages that store heap tables, system catalogs, indexes, and database metadata.

    Note that Greenplum Database append-only storage has its own built-in checksum protection separate from the feature described here.

    In Greenplum Database, heap checksums are enabled by default. Disabling checksums is strongly discouraged, but it can be done by setting the HEAP_CHECKSUMS parameter to off in the cluster configuration file supplied to the gpinitsystem management utility. Once a Greenplum Database system has been initialized, the heap checksums setting cannot be changed without reinitializing the system. To change the checksums setting, you must dump the databases, reinitialize the cluster, and then restore the databases. See gpinitsystem for information about setting HEAP_CHECKSUMS in the cluster configuration file.

    You can determine if heap checksums are enabled for a Greenplum Database cluster by checking the read-only data_checksums server configuration parameter:

    $ gpconfig -s data_checksums

    If a checksum verification fails, the page is not read into memory, so no transaction can read or write to it. An error is generated and the transaction is aborted.

    A new system configuration parameter, ignore_checksum_failure, can be used to alter the system's behavior when a checksum verification fails. When this parameter is set to on, a failed checksum verification generates a warning, but the page is read into memory. If a transaction writes to the page and it is subsequently flushed to storage, corrupted data can be propagated to the mirror.

    Warning: Because of the potential for data loss, the ignore_checksum_failure parameter should only be enabled when necessary to recover after data corruption has been detected.

    New Datatype Support

    Greenplum Database adds support for these built-in datatypes:

    • UUID: Universally Unique Identifiers (RFC 4122, ISO/IEC 9834-8:2005)
    • JSON: Variable, unlimited length JSON data. Greenplum Database also includes new built-in functions for supporting the JSON datatype. See Working with JSON Data.
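
    As a minimal sketch of the two new types, the following statements store a UUID and a JSON document and read one field back. The table name events, its columns, and the sample values are hypothetical, and the example assumes that the ->> operator described in Working with JSON Data is available.

```sql
-- Hypothetical table; names and values are illustrative only.
CREATE TABLE events (
    id      uuid,
    payload json
) DISTRIBUTED RANDOMLY;

INSERT INTO events VALUES
    ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', '{"type": "login", "user": "alice"}');

-- Extract one JSON field as text (assumes the ->> operator is available).
SELECT id, payload->>'type' AS event_type
FROM events;
```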

    You can now create enumerated datatypes using the CREATE TYPE name AS ENUM( 'label' [, ... ] ) syntax. See CREATE TYPE.
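
    The CREATE TYPE ... AS ENUM syntax might be used as in the sketch below. The type name order_status, the table orders, and the labels are assumptions chosen for illustration.

```sql
-- Hypothetical enumerated type and table; all names are illustrative.
CREATE TYPE order_status AS ENUM ('new', 'shipped', 'delivered');

CREATE TABLE orders (
    order_id int,
    status   order_status
) DISTRIBUTED BY (order_id);

INSERT INTO orders VALUES (1, 'new'), (2, 'delivered');

-- Enum values compare in the order in which the labels were declared,
-- so this returns the orders whose status comes after 'new'.
SELECT order_id
FROM orders
WHERE status > 'new';
```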

    Greenplum Database now supports arrays of arbitrary, complex compound types. See "Pseudo-Types" in Greenplum Database Data Types.

    Hashing is supported for the NUMERIC datatype.

    Improved XML Datatype Support

    XML datatype built-in functions and SQL commands from PostgreSQL 9.1 are now included. The new functions and function-like expressions include:

    • cursor_to_xml
    • cursor_to_xmlschema
    • database_to_xml
    • database_to_xmlschema
    • database_to_xml_and_xmlschema
    • query_to_xml
    • query_to_xml_and_xmlschema
    • query_to_xmlschema
    • schema_to_xml
    • schema_to_xmlschema
    • schema_to_xml_and_xmlschema
    • table_to_xml
    • table_to_xmlschema
    • table_to_xml_and_xmlschema
    • XMLCONCAT
    • XMLELEMENT
    • XMLFOREST
    • XMLPI
    • XMLROOT
    • XMLPARSE
    • XMLSERIALIZE

    The command SET XML OPTION { DOCUMENT | CONTENT } is supported to control XML data validation. The command is equivalent to setting the server configuration parameter xmloption.

    By default, XML data is validated as an XML content fragment. In Greenplum Database 4.3, XML data is validated as an XML document. The value for the server configuration parameter xmloption controls how XML data is validated. The default value for xmloption is content.

    Note: With the default XML OPTION setting, you cannot directly cast a character string to type xml if the string contains a document type declaration. The definition of XML content fragment does not allow a document type declaration. To cast such a string, either use the XMLPARSE function or change the XML option.
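
    The two workarounds named in the note can be sketched as follows. The sample XML string is made up for illustration.

```sql
-- With the default setting (content), a direct cast of this string is
-- rejected because it contains a document type declaration:
--   SELECT '<!DOCTYPE note><note>hello</note>'::xml;

-- Option 1: parse the string explicitly as a document.
SELECT XMLPARSE(DOCUMENT '<!DOCTYPE note><note>hello</note>');

-- Option 2: change the XML option for the session, then cast.
SET XML OPTION DOCUMENT;
SELECT '<!DOCTYPE note><note>hello</note>'::xml;
```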

    See Working with XML Data.

    Anonymous Blocks

    Anonymous blocks are now supported for Greenplum Database procedural languages (PostgreSQL 9.0 feature). See the DO command reference.
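
    A short anonymous block might look like the following sketch, which assumes that the PL/pgSQL language is registered in the current database. The variable name and the message text are illustrative.

```sql
-- Anonymous PL/pgSQL block: runs once and creates no function.
DO LANGUAGE plpgsql $$
DECLARE
    n bigint;
BEGIN
    SELECT count(*) INTO n FROM pg_class;
    RAISE NOTICE 'pg_class contains % rows', n;
END
$$;
```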

    dblink Module

    The dblink module is provided for making easy connections to other databases, either on the same database host or on a remote host. Greenplum Database provides dblink support for database users to perform short ad hoc queries in other databases. dblink is not intended as a replacement for external tables or for administrative tools such as gptransfer. In this release, dblink has several limitations:

    • The dblink_send_query(), dblink_is_busy(), and dblink_get_result() functions are not supported.

    • Statements that modify table data cannot use named or implicit dblink connections. Instead, you must provide the connection string directly in the dblink function for such queries.

    See dblink Functions for basic information about using dblink to query other databases. See dblink in the PostgreSQL documentation (https://www.postgresql.org/docs/8.3/static/dblink.html) for more information about individual functions.
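
    As a rough illustration of passing the connection string directly to the dblink function, the query below reads from another database in the same cluster. It assumes that the dblink functions are already installed in the current database and that a database named postgres accepts connections with the current credentials; both are assumptions about the environment.

```sql
-- Ad hoc query against another database, with the connection string
-- given directly in the dblink() call rather than a named connection.
SELECT *
FROM dblink('dbname=postgres',
            'SELECT datname FROM pg_database')
     AS t(datname name);
```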

    hstore Data Type and Functions

    The hstore module is provided, which implements a data type and associated functions for storing sets of (key,value) pairs within a single Greenplum Database data field. See hstore Functions for more information about installing and using this optional module.

    Cached Plan Invalidation

    Greenplum Database 5.0.0 invalidates cached query plans when any of the relations referenced by the plan are dropped or altered. An individual cached query plan is invalidated when:

    • a definition changes in any of the relations on which the plan depends
    • any user-defined functions used in the plan are modified
    • statistics are updated for any table on which the plan depends (ANALYZE)

    Additionally, all cached plans are invalidated when schemas, operators, or operator classes are modified.

    The query is replanned if and when the next demand for the cached plan occurs.
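
    The sequence below is a sketch of when invalidation would occur for a prepared statement. The table plan_demo and the statement name get_val are hypothetical, and the replanning itself is not visible in the query output.

```sql
-- Hypothetical table and prepared statement; names are illustrative.
CREATE TABLE plan_demo (id int, val text) DISTRIBUTED BY (id);

PREPARE get_val (int) AS
    SELECT val FROM plan_demo WHERE id = $1;

EXECUTE get_val(1);   -- plans the query and caches the plan

-- Changing the definition of a relation the plan depends on
-- invalidates the cached plan.
ALTER TABLE plan_demo ADD COLUMN note text;

EXECUTE get_val(1);   -- the query is replanned on this next use

DEALLOCATE get_val;
```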

    Ordering Results with NULL Values

    The SELECT command ORDER BY clause now supports ordering null values first or last. See the SELECT command reference.
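
    For example, the following self-contained query (the inline values are made up for illustration) sorts in descending order while keeping the null value at the end.

```sql
-- Without NULLS LAST, a descending sort places null values first.
SELECT v
FROM (VALUES (3), (NULL), (1)) AS t(v)
ORDER BY v DESC NULLS LAST;
-- Returns 3, 1, then the null value.
```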

    Transaction ID Functions

    New built-in functions are provided to return the transaction ID used by the current session. See the txid_functions described at System Information Functions in the PostgreSQL documentation (https://www.postgresql.org/docs/8.3/static/functions-info.html).
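
    Two of the functions listed in the PostgreSQL documentation could be called as shown in this brief sketch; the values returned depend on the state of the system.

```sql
-- Transaction ID of the current transaction.
SELECT txid_current();

-- Current snapshot, reported as xmin:xmax:list of in-progress IDs.
SELECT txid_current_snapshot();
```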

    Additional PostgreSQL Features

    Greenplum Database 5.0.0 also includes these features from PostgreSQL 8.3 and later:

    • Heap-only tuples (HOT) support improves the performance of updating non-indexed columns on tables that have indexes.

    • Support for setting configuration parameters, cost, and default and variadic arguments on a per-function basis.

    • Support for setting modifier input and output functions on a per-type basis.

    • Support for the CREATE/ALTER/DROP EXTENSION commands (PostgreSQL 9.1 feature).
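
    The per-function settings in the list above might be combined as in the sketch below. The function add_tax, its arguments, and the chosen values are hypothetical.

```sql
-- Hypothetical function; the name, body, and values are illustrative.
CREATE FUNCTION add_tax(amount numeric, rate numeric DEFAULT 0.2)
RETURNS numeric AS $$
    SELECT $1 * (1 + $2);
$$ LANGUAGE sql IMMUTABLE
COST 5                          -- per-function planner cost estimate
SET search_path = pg_catalog;   -- per-function configuration parameter

SELECT add_tax(100);        -- uses the default rate
SELECT add_tax(100, 0.1);   -- overrides the default rate
```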

    Python 2.7

    Pivotal Greenplum 5.0.0 upgrades the installed Python version to version 2.7. PL/Python and core Python management utilities are now based on version 2.7.

    gpdbrestore Support for CASTs

    gpdbrestore now supports restoring CASTs when using table and schema filters and the --change-schema flag. In previous releases, any restore that filtered based on a table or schema name did not restore user-created CASTs because the CASTs were not included in the filter.

    Enhanced Session State Monitoring

    The Greenplum Database session_state.session_level_memory_consumption view has been enhanced to include session idle time information. This information can be used from applications such as Greenplum Database Workload Manager and user-defined scripts to determine how long a session has been idle.

    To use the enhanced view, you must run SQL scripts that are provided in the Greenplum Database installation contrib directory, $GPHOME/share/postgresql/contrib:

    1. (Optional) For each database in which the view was previously registered, uninstall the older view. Run the uninstall_gp_session_state.sql SQL script to drop the view and related database objects. This example uninstalls the view from a database named testdb.

    psql -d testdb -f $GPHOME/share/postgresql/contrib/uninstall_gp_session_state.sql

    2. For all databases in which you want to register the new view, run the $GPHOME/share/postgresql/contrib/gp_session_state.sql script to create the new view definition and related database objects. This example installs the view in a database named testdb.

    psql -d testdb -f $GPHOME/share/postgresql/contrib/gp_session_state.sql
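
    Once the view is registered, it could be queried from a session connected to that database, as in this sketch. All columns are selected because the name of the idle time column is not given here.

```sql
-- Run in a database where gp_session_state.sql has been applied.
SELECT *
FROM session_state.session_level_memory_consumption;
```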

    For additional information about Workload Manager, refer to the Pivotal Greenplum Command Center and Workload Manager documentation (http://gpcc.docs.pivotal.io/latest/gpcc/welcome.html).

    Python Data Science Module Package

    Greenplum Database now includes a Python Data Science Module package that you can optionally install. This package includes a set of commonly-used, open source data science Python modules. The Python Data Science Module package is available for download in .gppkg format from Pivotal Network (https://network.pivotal.io/products). This package includes the following Python modules:

    Table 1: Python Data Science Modules

    Module Name       Version
    Beautiful Soup    4.6.0
    Gensim            2.2.0
    Keras             2.0.6
    Lifelines         0.11.1
    lxml              3.8.0
    NLTK              3.2.4
    NumPy             1.13.1
    Pandas            0.20.3
    Pattern-en        2.6
    pyLDAvis          2.1.1
    PyMC3             3.1
    scikit-learn      0.18.2
    SciPy             0.19.1
    spaCy             1.8.2
    StatsModels       0.8.0
    Tensorflow        1.1.0
    XGBoost           0.6a2

    See the Python Data Science Module Package installation guide for additional information about this package.

    R Data Science Library Package

    Greenplum Database now includes an R Data Science Library package that you can optionally install. This package includes a set of commonly-used, open source data science R libraries. The R Data Science Library package is available for download in .gppkg format from Pivotal Network.

    Refer to the R Data Science Library Package installation guide for additional information about this package, includ