Working with the Settings
You can customize the data lake settings to the needs of your business. This topic describes each of the settings.
Overview of General Settings
The following describes the general administration settings of your data lake. To access the data lake settings, select Settings under the Administration section in the left navigation pane.
- Application Url: The URL for the data lake console.
- Default Amazon S3 Bucket: The Amazon S3 bucket used to store datasets and manifests that are uploaded to the data lake. Manifest files created when users check out their carts are also stored in this bucket.
- Amazon Elasticsearch Index: The index created in the Amazon Elasticsearch cluster for packages in the data lake.
- Amazon Elasticsearch Url: The URL for the data lake Amazon Elasticsearch cluster.
- Amazon Elasticsearch Kibana Url: The URL for the Kibana application in the data lake Amazon Elasticsearch cluster.
- Cognito User Pool Id: The unique identifier for the data lake Cognito User Pool.
- Audit Logging: Enable or disable audit logging for the data lake. When audit logging is enabled, all user operations within the data lake are logged to datalake/audit-log in Amazon CloudWatch Logs.
- Default Search Results Limit: The maximum number of hits returned when performing a search in the data lake.
- Default Manifest Expiration Period: The length of time in seconds that manifest files generated in a user's cart are valid. This value can be between 15 minutes (900 seconds) and 4 hours (14400 seconds).
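The allowed range for the Default Manifest Expiration Period can be checked before a value is entered in the console. The following is a minimal sketch, not part of the data lake itself: the helper name `validate_manifest_expiration` and the constant names are hypothetical, and only the 900 and 14400 second bounds come from the setting description above.

```python
# Bounds taken from the setting description: 15 minutes to 4 hours, in seconds.
MIN_EXPIRATION_SECONDS = 15 * 60       # 900
MAX_EXPIRATION_SECONDS = 4 * 60 * 60   # 14400


def validate_manifest_expiration(seconds: int) -> int:
    """Return `seconds` unchanged if it is within the documented range.

    Hypothetical helper for illustration; raises ValueError otherwise.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise ValueError("expiration period must be a whole number of seconds")
    if not MIN_EXPIRATION_SECONDS <= seconds <= MAX_EXPIRATION_SECONDS:
        raise ValueError(
            f"expiration period must be between {MIN_EXPIRATION_SECONDS} "
            f"and {MAX_EXPIRATION_SECONDS} seconds, got {seconds}"
        )
    return seconds
```

For example, `validate_manifest_expiration(3600)` accepts a one-hour period, while a value of 600 (10 minutes) is rejected because it falls below the 15-minute minimum.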
Enable audit logging
- In the navigation pane, under the Administration section, select Settings.
- Under the General tab, check Enable access logging.
- Select Save to update the data lake settings.
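Once audit logging is enabled, the logged operations can be read back from Amazon CloudWatch Logs. The sketch below assumes that datalake/audit-log, as named in the setting description, is the CloudWatch Logs log group name. The function name `recent_audit_messages` is hypothetical. The client is passed in as a parameter so the function has no hard dependency on a particular SDK setup; it only relies on the CloudWatch Logs `filter_log_events` operation and reads a single page of results.

```python
from typing import Any, List

# Assumption: the name given in the Audit Logging description is the log group name.
AUDIT_LOG_GROUP = "datalake/audit-log"


def recent_audit_messages(logs_client: Any, limit: int = 20) -> List[str]:
    """Return the message text of up to `limit` audit events (first page only).

    `logs_client` is expected to behave like a CloudWatch Logs client,
    for example the object returned by boto3.client("logs").
    """
    response = logs_client.filter_log_events(
        logGroupName=AUDIT_LOG_GROUP,
        limit=limit,
    )
    return [event["message"] for event in response.get("events", [])]
```

With boto3 installed and AWS credentials configured, a call might look like `recent_audit_messages(boto3.client("logs"))`. If the log group in your deployment has a different name, adjust `AUDIT_LOG_GROUP` accordingly.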
Disable audit logging
- In the navigation pane, under the Administration section, select Settings.
- Under the General tab, uncheck Enable access logging.
- Select Save to update the data lake settings.
Modify the default search results limit
- In the navigation pane, under the Administration section, select Settings.
- Under the General tab, update the value of the Default Search Results Limit setting.
- Select Save to update the data lake settings.
Modify the default manifest expiration period
- In the navigation pane, under the Administration section, select Settings.
- Under the General tab, update the value of the Default Manifest Expiration Period setting.
- Select Save to update the data lake settings.
Governance Settings
Businesses often want datasets added to their data repositories to carry specific business context so that data consumers can easily identify what the data represents. To support this requirement, data lake administrators can create simple governance policies that require specific tags when datasets are registered with the data lake. For example, a business may require every dataset registered with the data lake to have a Category tag that identifies the classification of the data. The business may also want users to optionally define the Source of the data when registering a dataset. In this scenario, the governance settings would define Category as a Required tag and Source as an Optional tag.
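The effect of such a policy can be illustrated with a small check. This is a sketch under stated assumptions, not the data lake's own implementation: the `GOVERNANCE` mapping and the `missing_required_tags` function are hypothetical names, and the policy simply mirrors the Category and Source example from the scenario above.

```python
from typing import Dict, List

# Example policy from the scenario: tag name -> "Required" or "Optional".
GOVERNANCE: Dict[str, str] = {"Category": "Required", "Source": "Optional"}


def missing_required_tags(tags: Dict[str, str], governance: Dict[str, str]) -> List[str]:
    """Return the required tag names that are absent or blank in `tags`.

    Hypothetical helper for illustration; an empty list means the
    dataset's tags satisfy the governance policy.
    """
    return sorted(
        name
        for name, rule in governance.items()
        if rule == "Required" and not tags.get(name, "").strip()
    )
```

A dataset registered with only a Source tag would fail this check because Category is missing, while a dataset with only a Category tag would pass because Source is optional.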
Add a governance setting
- In the navigation pane, under the Administration section, select Settings.
- Under the Governance tab, select Add Tag Governance.
- Enter a Tag Name and select the Governance for whether the tag is Required or Optional.
- Select Save to update the data lake governance settings.
Remove a governance setting
- In the navigation pane, under the Administration section, select Settings.
- Under the Governance tab, select the X on the right side of the governance setting you want to remove.
- Select Save to update the data lake governance settings.