How it works

The S3 connector pulls in all documents from the specified Amazon S3 bucket. It supports various file formats including PDF, DOC, DOCX, TXT, and more.

Documents are updated once a day.

Setting up

Authorization

  1. Log into your AWS Management Console.
  2. Navigate to the IAM (Identity and Access Management) dashboard.
  3. In the left sidebar, click on “Users” and then “Create user”.
  4. Set a name for the new user (e.g., “OnyxS3Connector”) and click “Next”.
  5. Click “Attach policies directly” and search for “AmazonS3ReadOnlyAccess” or another policy that grants equivalent read access.
  6. Select this policy and click “Next”.
  7. Add any tags if needed, then click “Create user”.
  8. You’ll now be on the users page. Click on the user you just created.
  9. Click “Create access key” and select “Third-party service”.
  10. Select “I understand the above recommendation and want to proceed to create an access key” and then “Next”.
  11. Optionally, set a description tag and then click “Create access key”.
  12. You should now see the Access Key ID and Secret Access Key. Copy both immediately, as you won’t be able to view the Secret Access Key again.
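
If the managed “AmazonS3ReadOnlyAccess” policy is broader than you want, a policy scoped to a single bucket is one possible alternative. The sketch below builds such a policy document with the Python standard library. The function name and the bucket name "my-docs-bucket" are placeholders. It is also an assumption, not confirmed by this guide, that listing and reading objects are the only permissions the connector needs.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Build an IAM policy document granting read-only access to one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing objects is a bucket-level permission.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Downloading objects is an object-level permission.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "my-docs-bucket" is a placeholder; substitute your own bucket name.
    print(json.dumps(read_only_bucket_policy("my-docs-bucket"), indent=2))
```

The printed JSON could be pasted into the IAM policy editor as a custom policy and attached to the user in place of the managed policy. If indexing then fails with access errors, fall back to “AmazonS3ReadOnlyAccess”.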

Indexing

  1. Navigate to the Admin Dashboard and select the S3 Connector.
  2. In Step 1, provide your AWS credentials:

  • If AWS Access Key ID and AWS Secret Access Key are provided, they will be used for authenticating the connector.
  • Otherwise, the Profile Name will be used (if provided).
  • If no credentials are provided, then the connector will try to authenticate with any default AWS credentials available.

The credential fields are AWS Access Key ID, AWS Secret Access Key, and Profile Name.

  3. Click “Update” to save your credentials.
  4. In Step 2, specify which S3 bucket you want to make searchable.
  5. Click “Connect” to begin indexing.

Indexing now begins, and you can add more buckets with the same credentials.
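
The credential precedence described in Step 1 (explicit keys first, then a profile name, then any default credentials) can be sketched as a small function. This is an illustration of the stated order only, not the connector's actual code. The function name is hypothetical, and the returned keyword names are chosen to match the parameters of `boto3.Session`, on the assumption that an AWS SDK session is built from them.

```python
from typing import Optional


def resolve_session_kwargs(
    access_key_id: Optional[str] = None,
    secret_access_key: Optional[str] = None,
    profile_name: Optional[str] = None,
) -> dict:
    """Pick credentials in the order the guide describes."""
    # 1. An explicit key pair wins when both halves are present.
    if access_key_id and secret_access_key:
        return {
            "aws_access_key_id": access_key_id,
            "aws_secret_access_key": secret_access_key,
        }
    # 2. Otherwise fall back to a named profile, if one was given.
    if profile_name:
        return {"profile_name": profile_name}
    # 3. With nothing provided, return no overrides so the SDK's
    #    default credential lookup applies.
    return {}


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(resolve_session_kwargs(profile_name="docs-reader"))
```

With boto3 installed, the result could be passed along as `boto3.Session(**resolve_session_kwargs(...))`. An empty dictionary leaves boto3 to search its default credential sources.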

Understanding S3 Structure

Amazon S3 organizes data into buckets. Each bucket can contain an unlimited number of objects (files). You can think of a bucket as a root directory, and the objects as files within that directory.
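
To make the directory analogy concrete: each object is identified by a key, which is a plain string, and slashes inside a key are what make objects look like they sit in folders. The sketch below groups a few made-up keys by their top-level prefix. The function name and sample keys are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_keys_by_prefix(keys: Iterable[str]) -> Dict[str, List[str]]:
    """Group object keys by the text before their first '/'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        prefix, sep, _rest = key.partition("/")
        # Keys without a slash sit directly at the bucket root ("").
        groups[prefix if sep else ""].append(key)
    return dict(groups)


if __name__ == "__main__":
    sample_keys = [
        "handbook.pdf",
        "policies/leave.docx",
        "policies/2024/travel.txt",
    ]
    for prefix, members in group_keys_by_prefix(sample_keys).items():
        print(prefix or "(bucket root)", members)
```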

For more information on S3 structure, visit the Amazon S3 documentation.