What is the Data Lake Command Line Interface

The Data Lake Command Line Interface (CLI) is a tool to manage your Data Lake. By downloading and configuring the Data Lake CLI, you can control multiple aspects of your Data Lake from the command line and automate them through scripts.

Supported Commands

For a list of the commands available in the Data Lake CLI, see Available Commands in the AWS CLI Command Reference.

Getting Set Up with the Data Lake Command Line Interface

Before you can start using the Data Lake Command Line Interface, you must have a Data Lake account and set up your CLI environment.

Note: The Data Lake CLI makes API calls to services over HTTPS. Outbound connections on TCP port 443 must be allowed for these calls to succeed.
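
If you are unsure whether outbound HTTPS is allowed from your machine, a quick check with curl can help. The following is a minimal sketch, not part of the Data Lake CLI: the function names endpoint_url and check_https are hypothetical, and the host in the commented example is the sample endpoint shown later on this page, so substitute your own. It assumes curl is installed.

```shell
#!/bin/sh
# Hypothetical helper: build an HTTPS URL from a Data Lake API endpoint.
# Accepts a bare host name or a URL that already starts with https://.
endpoint_url() {
  case "$1" in
    https://*) printf '%s\n' "$1" ;;
    *)         printf 'https://%s\n' "$1" ;;
  esac
}

# Hypothetical helper: attempt an HTTPS request to the endpoint (port 443).
# curl exits 0 when it receives any HTTP response, and non-zero when the
# connection cannot be made (blocked port, DNS failure, timeout).
check_https() {
  curl --silent --output /dev/null --max-time 10 "$(endpoint_url "$1")"
}

# Example (replace the host with your own endpoint):
# check_https samplek19ruh.execute-api.us-east-1.amazonaws.com \
#   && echo "Port 443 is reachable" \
#   || echo "Could not connect over HTTPS"
```

A successful check only shows that the network path is open; it does not validate your credentials.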

Getting Your Data Lake API Endpoint, Access Key, and Secret Access Key

CLI credentials consist of an access key and a secret access key, which are used to sign the programmatic requests that you make to the Data Lake. If you don't have access keys, ask your Data Lake administrator to grant you API access through the Data Lake console.

  1. Open the Data Lake console.
  2. In the navigation pane, choose Profile under the My Account menu.
  3. Your Data Lake API Endpoint and Access Key are located under the API Access section of your profile.
  4. To generate a Secret Access Key for your account, click the Generate button under the API Access section of your profile. Once generated, the Secret Access Key is displayed in the API Credentials pop-up, where it can be downloaded one time only. Your credentials will look something like this:

    • Data Lake API Endpoint: samplek19ruh.execute-api.us-east-1.amazonaws.com
    • Data Lake Access Key: SJxiA_EXAMPLEKEY
    • Data Lake Secret Access Key: f10e347df150638393502dEXAMPLEKEY

Installing the Data Lake Command Line Interface (Linux or OS X)

Follow these steps from the command line to install the Data Lake CLI on Linux or OS X.

To install the Data Lake CLI

Prerequisites

The Data Lake CLI requires Node.js. Check whether it is already installed; if it is, the command prints a version number:

$ node --version
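
In a setup script, the same check can be made without failing when Node.js is missing. This is a small sketch using the POSIX `command -v` builtin; the helper name have_cmd is illustrative only.

```shell
#!/bin/sh
# Return success if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd node; then
  echo "Node.js found: $(node --version)"
else
  echo "Node.js not found; install it before continuing."
fi
```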

Installing the Data Lake CLI

  1. Download the Data Lake CLI package using wget or curl.
  2. Unzip the package.
  3. Run the install executable.

On Linux and OS X, the following three commands correspond to these steps:

$ curl "https://s3.amazonaws.com/solutions-reference/data-lake-solution/latest/datalake-cli-bundle.zip" -o "datalake-cli-bundle.zip"
$ unzip datalake-cli-bundle.zip -d datalake-cli-bundle
$ sudo ./datalake-cli-bundle/install.sh
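
The commands above use curl. Since step 1 also mentions wget, the download can be sketched so that it uses whichever tool is present. The function name download_bundle and the variable names are assumptions for illustration; the URL and file name are the ones shown above. The curl options --fail and --location are additions so that an HTTP error or redirect does not leave an invalid zip file behind.

```shell
#!/bin/sh
BUNDLE_URL="https://s3.amazonaws.com/solutions-reference/data-lake-solution/latest/datalake-cli-bundle.zip"
BUNDLE_FILE="datalake-cli-bundle.zip"

# Hypothetical helper: download the CLI bundle with curl, or with wget
# if curl is not installed. Returns non-zero if neither tool is found.
download_bundle() {
  if command -v curl >/dev/null 2>&1; then
    curl --fail --location "$BUNDLE_URL" -o "$BUNDLE_FILE"
  elif command -v wget >/dev/null 2>&1; then
    wget "$BUNDLE_URL" -O "$BUNDLE_FILE"
  else
    echo "Neither curl nor wget is installed." >&2
    return 1
  fi
}

# Example: download, unzip, and install, stopping at the first failure.
# download_bundle \
#   && unzip "$BUNDLE_FILE" -d datalake-cli-bundle \
#   && sudo ./datalake-cli-bundle/install.sh
```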

Test the Data Lake CLI Installation

Confirm that the CLI is installed correctly by viewing its help output. Open a terminal, shell, or command prompt, enter datalake help, and press Enter:

$ datalake help

Where to Go from Here

After installing the Data Lake CLI, configure it for use with your Data Lake.