How to access EODATA using boto3 on WEkEO Elasticity

In this article you will learn how to access the EODATA repository using a Python library called boto3, running on a Linux or Windows virtual machine within the WEkEO Elasticity cloud.

What Are We Going To Cover

  • Installing boto3

  • How to execute scripts found in this article

  • Browsing EODATA

  • Downloading a single file from the EODATA repository

Prerequisites

No. 1 Account

You need a WEkEO Elasticity hosting account with access to the Horizon interface: https://horizon.cloudferro.com.

No. 2 A virtual machine

You need a virtual machine running on the WEkEO Elasticity cloud. This article is written for Ubuntu 22.04 and for Windows Server 2022.

Other operating systems might also work, but they are outside the scope of this article and might require adjusting the commands provided here.

Either way, your virtual machine needs to have access to the network which gives access to the EODATA repository. This network is either called eodata or has a name which starts with eodata_.

Linux VM

You can create a Linux virtual machine by following one of these articles:

Windows VM

To learn how to create a Windows virtual machine, see this article: How to create Windows VM on OpenStack Horizon and access it via web console on WEkEO Elasticity

If you are using the web console to access your Windows virtual machine, you can open the article you are currently reading in a web browser (like Microsoft Edge) installed on that virtual machine and copy the Python code to your chosen text editor.

No. 3 Python

You need Python installed on your virtual machine.

If you are using Linux, this article can help: How to install Python virtualenv or virtualenvwrapper on WEkEO Elasticity

And on Windows, you can follow this article: How to install Python in Windows on WEkEO Elasticity

No. 4 Obtained access and secret key

To access EODATA, you need to obtain your access and secret key. You can do it by following this article: How to get credentials used for accessing EODATA on a cloud VM on WEkEO Elasticity

No. 5 Basic knowledge about Python

boto3 is a Python library so you have to know your way around Python.

Installing boto3

Follow the procedure for installing boto3 that matches your operating system:

Installing boto3 on Linux

If you are using a Python environment like virtualenv, enter the environment in which you wish to install boto3. In it, execute the following command:

pip3 install boto3

You can also install the package globally:

sudo apt install python3-boto3

Installing boto3 on Windows

Follow this article to install boto3 on Windows: How to Install Boto3 in Windows on WEkEO Elasticity
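To confirm that boto3 can be imported by the Python interpreter you intend to use, a short check such as the one below may help. It is only a sketch: the function name boto3_version is made up for this illustration, and it relies solely on boto3 exposing its version as boto3.__version__.

```python
def boto3_version():
    """Return the installed boto3 version string, or None if boto3 is missing."""
    try:
        # Imported here so that a missing package is reported, not fatal
        import boto3
    except ImportError:
        return None
    return boto3.__version__


version = boto3_version()
if version is None:
    print("boto3 is not installed in this Python environment")
else:
    print(f"boto3 {version} is available")
```

If you use virtualenv, run the check from inside the activated environment, because each environment has its own set of installed packages.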

How to execute scripts found in this article

The method of executing the scripts is different, depending on the operating system of your choice.

How to execute scripts using Linux command line

Open a text editor of your choice like nano or vim. Paste the script. Perform appropriate modifications to the code as instructed (like assigning values to variables). Save the file.

Once you have exited from the text editor, execute the python3 command followed by the name of your script from the directory it is in. For example:

python3 browse.py

The script should now run and print its output, if any, to the terminal.

How to execute scripts using Windows command prompt

Open a plain text editor (like Notepad). Paste the script. Perform appropriate modifications to the code as instructed (like assigning values to variables). Save the file with the .py extension (make sure that Windows does not add a .txt extension on top of it).

Open the command prompt (cmd.exe). Navigate to the directory in which the script is located using the cd command, for example:

cd C:\Users\John\scripts

Execute the script using the python command followed by its name, for example:

python browse.py

Browsing EODATA

You can use boto3 to browse the EODATA repository. The code which allows you to achieve this goal can be found in this section.

Variable name   What should be assigned to it
access_key      Your access key. Obtain it by following Prerequisite No. 4.
secret_key      Your secret key. Obtain it by following Prerequisite No. 4.
directory       The directory within the EODATA repository which you want to explore.

When filling in the variable directory, make sure to follow these rules:

  • Use slashes / as separators between elements of that path - directories and files

  • Do not start the path with a slash /

  • Since the element you are exploring is a directory, finish the path with a slash /

  • Start the path with the name of a folder found within the root directory of the EODATA repository (for example Sentinel-2 or Sentinel-5P)

If you want to explore the root directory of the EODATA repository, assign an empty string to variable directory:

directory=''
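The rules above can be applied automatically before the value is passed to boto3. The following is a minimal sketch; normalize_directory is a hypothetical helper, not part of boto3, and converting backslashes to slashes is an assumption about what a user typing a Windows-style path would want.

```python
def normalize_directory(path):
    """Return path in the form expected by the directory variable."""
    # Assumption: backslashes were meant as separators, so convert them
    cleaned = path.strip().replace("\\", "/")
    # Remove leading and trailing slashes
    cleaned = cleaned.strip("/")
    if not cleaned:
        # An empty string stands for the root directory of the repository
        return ""
    # A directory path has to finish with exactly one slash
    return cleaned + "/"


print(normalize_directory("/Sentinel-2/MSI"))
```

The helper only fixes the separators and the leading and trailing slashes. It does not check whether the directory exists in the EODATA repository.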

If you don’t have a directory which you want to explore but you want to simply test this method, you can leave the value which was assigned to variable directory in the example code below.

Variables host and container contain the EODATA endpoint and the name of the container used, respectively. You do not need to modify them.

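import boto3

# Credentials obtained in Prerequisite No. 4
access_key = 'YOUR_ACCESS_KEY'
secret_key = 'YOUR_SECRET_KEY'
# Directory to explore; it must end with a slash
directory = 'Envisat-ASAR/ASAR/ASA_WSS_1P/2012/04/08/'

# EODATA endpoint and container name; no need to modify these
host = 'http://data.cloudferro.com'
container = 'DIAS'

s3 = boto3.client(
    's3',
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
    endpoint_url=host,
)

# With Delimiter='/', subdirectories are returned under 'CommonPrefixes'
response = s3.list_objects(Delimiter='/', Bucket=container, Prefix=directory, MaxKeys=30000)
print(response['CommonPrefixes'])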

If you provided your access and secret keys but did not modify the variable directory, the code above will list products found in the Envisat-ASAR/ASAR/ASA_WSS_1P/2012/04/08/ directory of the EODATA repository. In that case, the output should look like this:

[{'Prefix': 'Envisat-ASAR/ASAR/ASA_WSS_1P/2012/04/08/ASA_WSS_1PNESA20120408_110329_000000603113_00267_52867_0000.N1/'}, {'Prefix': 'Envisat-ASAR/ASAR/ASA_WSS_1P/2012/04/08/ASA_WSS_1PNESA20120408_110428_000000603113_00267_52867_0000.N1/'}, {'Prefix': 'Envisat-ASAR/ASAR/ASA_WSS_1P/2012/04/08/ASA_WSS_1PNESA20120408_110446_000000603113_00267_52867_0000.N1/'}]

This output is a list of dictionaries. Each dictionary contains a key called Prefix, which holds the path to a subdirectory of the explored directory. Files located directly in that directory are not included in CommonPrefixes. Instead of printing the list as above, you can loop through it to make the output more legible:

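import boto3

# Credentials obtained in Prerequisite No. 4
access_key = 'YOUR_ACCESS_KEY'
secret_key = 'YOUR_SECRET_KEY'
# Directory to explore; it must end with a slash
directory = 'Envisat-ASAR/ASAR/ASA_WSS_1P/2012/04/08/'

# EODATA endpoint and container name; no need to modify these
host = 'http://data.cloudferro.com'
container = 'DIAS'

s3 = boto3.client(
    's3',
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
    endpoint_url=host,
)

# Print one subdirectory path per line
response = s3.list_objects(Delimiter='/', Bucket=container, Prefix=directory, MaxKeys=30000)
for entry in response['CommonPrefixes']:
    print(entry['Prefix'])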

This time, the output should show only the paths:

Envisat-ASAR/ASAR/ASA_WSS_1P/2012/04/08/ASA_WSS_1PNESA20120408_110329_000000603113_00267_52867_0000.N1/
Envisat-ASAR/ASAR/ASA_WSS_1P/2012/04/08/ASA_WSS_1PNESA20120408_110428_000000603113_00267_52867_0000.N1/
Envisat-ASAR/ASAR/ASA_WSS_1P/2012/04/08/ASA_WSS_1PNESA20120408_110446_000000603113_00267_52867_0000.N1/
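The listing scripts above have two limitations. Indexing the response with ['CommonPrefixes'] raises a KeyError when the directory has no subdirectories, and a single list_objects response may be truncated (AWS S3 returns at most 1,000 entries per call; whether the EODATA endpoint applies the same limit is an assumption here). The sketch below uses boto3's paginator for list_objects and collects both subdirectories and files. The function name list_directory is hypothetical.

```python
def list_directory(s3_client, bucket, prefix):
    """Return (subdirectories, files) found directly under prefix.

    s3_client is expected to be a boto3 S3 client such as the one
    created with boto3.client('s3', ...) in the scripts above.
    """
    subdirectories, files = [], []
    paginator = s3_client.get_paginator("list_objects")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        # Either key may be missing from a page, hence .get() with a default
        subdirectories += [p["Prefix"] for p in page.get("CommonPrefixes", [])]
        files += [obj["Key"] for obj in page.get("Contents", [])]
    return subdirectories, files
```

With the client and variables from the scripts above, list_directory(s3, container, directory) would return the two lists, which you can then loop through and print.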

Downloading a single file from the EODATA repository

This section covers how to download a file from the EODATA repository.

The script below should download that file to the directory from which the script is executed. If that directory already contains a file with the same name as the one you are downloading, it will be overwritten without a prompt for confirmation.

In the code below, assign values to the following variables:

Variable name   What should be assigned to it
access_key      Your access key. Obtain it by following Prerequisite No. 4.
secret_key      Your secret key. Obtain it by following Prerequisite No. 4.
key             Full path (including folders) of the file you want to download from the EODATA repository.

When filling in the variable key, make sure to follow these rules:

  • Use slashes / as separators between elements of that path - directories and files

  • Do not start or finish the path with a slash /

  • Start path with the name of the folder found within the root directory of the EODATA repository (for example Sentinel-2 or Sentinel-5P)

If you don’t have a file which you want to download but you simply want to test this method of downloading files, you can leave the value which was assigned to the variable key in the example code below.

Again, variables host and container contain the EODATA endpoint and the name of the container being used, respectively. You do not need to modify them.

import boto3

# Credentials obtained in Prerequisite No. 4
access_key = 'YOUR_ACCESS_KEY'
secret_key = 'YOUR_SECRET_KEY'
# Full path of the file to download
key = 'Landsat-5/TM/L1T/2011/11/16/LS05_RMPS_TM__GTC_1P_20111116T100042_20111116T100111_147386_0194_0035_4BF1/LS05_RMPS_TM__GTC_1P_20111116T100042_20111116T100111_147386_0194_0035_4BF1.BP.PNG'

# EODATA endpoint and container name; no need to modify these
host = 'http://data.cloudferro.com'
container = 'DIAS'

s3 = boto3.resource(
    's3',
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
    endpoint_url=host,
)

bucket = s3.Bucket(container)

# Save the file under its own name (the last element of the path)
filename = key.split('/')[-1]

bucket.download_file(key, filename)

If you provided your access key and secret key but did not change the contents of the variable key, the code should download the file called

LS05_RMPS_TM__GTC_1P_20111116T100042_20111116T100111_147386_0194_0035_4BF1.BP.PNG

which is located within the root directory of product

LS05_RMPS_TM__GTC_1P_20111116T100042_20111116T100111_147386_0194_0035_4BF1

After executing the script, the output should be empty. Regardless, the downloaded file should be visible within the directory from which the script was executed. For example, this is what it looks like on Linux:

[Screenshot: the downloaded file listed in the directory on Linux]
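The overwrite behaviour described in this section can be avoided with a small guard around the download call. The following is a minimal sketch, not part of the original script: local_filename and download_without_overwrite are hypothetical helper names, and the bucket argument is assumed to be a boto3 Bucket resource like the one returned by s3.Bucket(container).

```python
import os


def local_filename(key):
    """Return the last element of key, as the download script does."""
    return key.split("/")[-1]


def download_without_overwrite(bucket, key, target_dir="."):
    """Download key into target_dir unless a file with that name exists.

    Returns the local path, or None if the download was skipped.
    """
    target = os.path.join(target_dir, local_filename(key))
    if os.path.exists(target):
        print(f"Skipping {key}: {target} already exists")
        return None
    # Same call as in the download script, with an explicit target path
    bucket.download_file(key, target)
    return target
```

Calling download_without_overwrite(bucket, key) in place of bucket.download_file(key, filename) would leave an existing local file untouched and print a short message instead.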

What To Do Next

You can further modify these scripts so that they better suit your needs, or integrate them with your own applications. These scripts might also work in other development environments, but that is outside the scope of this article.
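One possible modification is to keep the keys out of the script file. The sketch below reads them from two environment variables. The names EODATA_ACCESS_KEY and EODATA_SECRET_KEY are hypothetical and chosen only for this illustration, as is the function name credentials_from_environment.

```python
import os


def credentials_from_environment(environ=os.environ):
    """Return (access_key, secret_key) read from the given mapping."""
    try:
        # Hypothetical variable names; set them before running the script
        return environ["EODATA_ACCESS_KEY"], environ["EODATA_SECRET_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing} is not set") from None
```

In the scripts from this article, the two assignments of access_key and secret_key could then be replaced with access_key, secret_key = credentials_from_environment().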

boto3 can also be used to access object storage containers from WEkEO Elasticity cloud: How to access object storage from WEkEO Elasticity using boto3