Introduction

An automated testing tool and dashboard for datasets available through Data Retriever

This Django project serves as a status server and dashboard for the recipes (retriever-recipes) of the Data Retriever platform. Information about each data package recipe can be found in the Data Retriever recipe docs. Maintainers and users can check the status of dataset packages, i.e. whether the datasets install properly, and see what changes have been made to these datasets. Check out the retrieverdash source code.

The dashboard

The dashboard is built using the Django web framework for Python. It shows the installation status of each dataset. The status of each dataset is stored in the dataset_details.json file, which is generated by the dashboard_script.py module. Apart from the status of the dataset, retrieverdash also stores the last known state of the data. If the data has changed recently, retrieverdash produces “diffs” that describe the changes that have happened in the dataset.
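As a rough illustration of what a “diff” between two stored states of a dataset can look like, here is a minimal sketch using Python's standard difflib module. This is not retrieverdash's own implementation: the function name dataset_diff, the file name, and the sample rows are all hypothetical.

```python
import difflib


def dataset_diff(old_lines, new_lines, name="dataset.csv"):
    """Return a unified diff between two versions of a dataset file.

    Hypothetical helper for illustration; not part of retrieverdash.
    """
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="{} (previous)".format(name),
            tofile="{} (current)".format(name),
            lineterm="",  # input lines carry no trailing newlines
        )
    )


# Made-up sample data: the last known state and a freshly installed state.
old = ["id,species,count", "1,oak,10", "2,pine,7"]
new = ["id,species,count", "1,oak,12", "2,pine,7"]

for line in dataset_diff(old, new):
    print(line)
```

Lines prefixed with - come from the previous state and lines prefixed with + from the current one; identical inputs produce an empty diff.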

Features of Data Retriever Dashboard

The Data Retriever Dashboard performs a number of tasks, including:
  1. Periodically running a script that checks every dataset by installing it.
  2. Finding the changes between successive versions of datasets.
  3. Displaying details of datasets and their changes on the dashboard.
  4. Checking whether spatial datasets can be installed into Postgres according to the Data Retriever guidelines for spatial datasets.

Running the dashboard locally

Required packages

To run Data Retriever Dashboard locally from source, you’ll need Python 3.4+ with the following packages installed:

  • django
  • retriever
  • django-crontab
  • pytest-django
  • psycopg2-binary
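A quick way to confirm that the packages listed above are importable in the current environment is sketched below. The helper name missing_packages is hypothetical; the mapping pairs each pip package name with the module name it is normally imported under.

```python
import importlib.util

# pip package name -> importable top-level module name
REQUIRED = {
    "django": "django",
    "retriever": "retriever",
    "django-crontab": "django_crontab",
    "pytest-django": "pytest_django",
    "psycopg2-binary": "psycopg2",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```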

Steps to run the dashboard from source

  1. Clone the repository.
  2. From the directory containing manage.py, run pip install -r requirements.txt to install the requirements for the dashboard.
  3. Change the password and other details in the status_dashboard_tools.py file.
  4. Run python manage.py crontab add to add the cron job that runs the script checking the installation of datasets.
  5. Run python manage.py runserver to start the server for the dashboard.
  6. Open a browser and load the URL 127.0.0.1:8000. This is the dashboard.

Note

Initially you won’t see anything on the dashboard because the script is set to run every Sunday at 12:00 AM. To run it immediately, go to the directory containing manage.py and run python manage.py crontab show. Copy the hash of the cron job from the output, then run python manage.py crontab run hash_of_the_cron. The script will run immediately. Open another terminal and start the dashboard server; the dashboard will now display the details.
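For reference, django-crontab reads its schedules from a CRONJOBS list in the Django settings, where each entry pairs a cron expression with the dotted path of a function to call. The fragment below is a sketch of what a "Sunday at 12:00 AM" entry could look like; the dotted path is a hypothetical placeholder, and the project's actual settings may differ.

```python
# settings.py fragment (illustrative sketch, not the project's actual settings)

CRONJOBS = [
    # Fields: minute hour day-of-month month day-of-week
    # "0 0 * * 0" means 00:00 on every Sunday.
    ("0 0 * * 0", "dashboard.cron.check_datasets"),  # hypothetical dotted path
]
```

To try a different schedule, only the cron expression needs to change, followed by running python manage.py crontab add again to register the updated job.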

Acknowledgments

This project was developed by Apoorva Pandey as part of Google Summer of Code 2018.