Searching the data lake

Once data is cataloged within the data lake, it is automatically indexed by the data lake search engine. This enables users to search and browse the packages available within the data lake and select data of interest to consume in a way that meets their business needs.

Search for data of interest

  1. In the provided text field, enter a search term.
  2. Select Search to submit your query to the search engine.
  3. Select the Edit icon next to the result of interest to open the package and review whether it fits your business needs.

Wildcard searches

The data lake supports wildcard searches with the '*' character.

Example 1: To list all the packages in the data lake, enter * in the search box and select Search. This wildcard search returns all the packages in the data lake, up to the number of packages configured to be returned at a time.

Example 2: To search for all packages that contain words starting with test, enter test* in the search box and select Search. This wildcard search returns the packages in the data lake whose title, description, or metadata contains words starting with test, up to the number of packages configured to be returned at a time.
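
The matching behavior described in the two examples can be sketched in a few lines of Python. This is only an illustration of the semantics, not the search engine's implementation: the package fields (title, description, metadata), the case-insensitive comparison, the `limit` parameter, and the sample packages are all assumptions made for the sketch.

```python
from fnmatch import fnmatchcase


def matches(package, pattern):
    """Return True if any word in the package's searchable text matches the pattern."""
    # Assumption: title, description, and metadata values are the searchable text.
    parts = [package.get("title", ""), package.get("description", "")]
    parts.extend(str(value) for value in package.get("metadata", {}).values())
    words = " ".join(parts).lower().split()
    # Assumption: matching is case-insensitive and applied word by word.
    return any(fnmatchcase(word, pattern.lower()) for word in words)


def search(packages, pattern, limit=10):
    """Return matching packages, capped at `limit` (stands in for the configured page size)."""
    hits = [package for package in packages if matches(package, pattern)]
    return hits[:limit]


# Hypothetical sample packages, for illustration only.
packages = [
    {"title": "Test results", "description": "Nightly runs", "metadata": {"team": "qa"}},
    {"title": "Sales", "description": "Quarterly testing data", "metadata": {"team": "finance"}},
    {"title": "Trips", "description": "Taxi rides", "metadata": {"team": "ops"}},
]

print([p["title"] for p in search(packages, "test*")])
print(len(search(packages, "*", limit=2)))
```

In this sketch, `test*` matches the first two packages because "test" and "testing" both start with test, while `*` matches every package and is then cut off at the limit.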

AWS Glue metadata searches

Every time the AWS Glue crawler crawls your packages, the index service is called to populate the search index with AWS Glue catalog information, such as column names, column comments, and table descriptions.

This allows you to search on that metadata, for example to find packages that contain datasets with information about trips. The image below illustrates this type of search:

Screenshot

You can also search for column_comment and table_desc.
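
As a rough illustration of how catalog information could end up in the search index, the sketch below flattens a table definition shaped like an AWS Glue table (`Description`, `StorageDescriptor.Columns` with `Name` and `Comment`) into a single searchable document. The function name `build_index_document`, the `column_name` field name, and the sample table are hypothetical; only `column_comment` and `table_desc` are taken from this page, and the actual index service may structure its documents differently.

```python
def build_index_document(table):
    """Flatten a Glue-style table definition into searchable fields."""
    columns = table.get("StorageDescriptor", {}).get("Columns", [])
    return {
        # Field names column_comment and table_desc come from this page;
        # column_name is an assumed name for the column-names field.
        "table_desc": table.get("Description", ""),
        "column_name": [column["Name"] for column in columns],
        # Columns without a comment contribute nothing to column_comment.
        "column_comment": [column["Comment"] for column in columns if column.get("Comment")],
    }


# Hypothetical table definition, for illustration only.
glue_table = {
    "Name": "trips",
    "Description": "Taxi trips by day",
    "StorageDescriptor": {
        "Columns": [
            {"Name": "trip_id", "Type": "string", "Comment": "Unique trip identifier"},
            {"Name": "fare", "Type": "double"},
        ]
    },
}

doc = build_index_document(glue_table)
print(doc)
```

With a document like this in the index, a search term can match a column name, a column comment, or the table description.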

See Also

Working with packages

Working with my cart