QuARG is a Python client that allows network operators to generate quality assurance (QA) reports from start to finish. These reports utilize IRIS’s database of MUSTANG data quality metrics to find and highlight potential issues in the data, reducing the amount of time that analysts need to spend scanning the data for problems.
Users have the ability to customize QuARG to adapt to their particular network. Some features that can be personalized:
This utility guides users through the process of generating a list of potential issues, examining and tracking issues, and generating a report.
QuARG can be installed on Linux and macOS.
IRIS (Incorporated Research Institutions for Seismology) DMC (Data Management Center) has been performing quality assurance checks on data since the Transportable Array began in 2004. Since that time, we have expanded and improved our quality assurance efforts, including developing a comprehensive quality assurance system called MUSTANG with over 40 metrics available through our webservices.
In addition to the weekly QA performed on the TA network, we developed monthly quality assurance reports for the _GSN virtual network. Since then, we added a few more networks to our monthly and quarterly network reports as we refined our methods and improved the QuARG utility. We wrapped up our final network report in Summer 2019 with the goal of providing the QuARG utility to individual networks for the purpose of performing their own QA. While this tool was born at the DMC, intended for generating reports on very specific networks and leveraging tools that we have available in Seattle, the utility has since been expanded to be useful to network operators working on their own networks.
Over the years, we have refined the process of generating a report into four primary steps:
The QuARG utility follows this same workflow to help guide users through the process from start to finish.
In case you are new to this or need a refresher, here is a review of a few terms:
Metric
“Metric” is a term that is used to describe some quality or aspect of the data. A metric can be quite simple - say, the number of gaps in a day or the maximum value recorded in a day. It can also be more complex, such as taking the cross correlation of two channels at a station and recording the maximum correlation value. Metrics do not inherently tell us whether the data is “good” or “bad” - they simply tell us something about the data. It is up to us, as Network Operators or Researchers, to use those metrics to decide what data is good and what data is bad. And remember, what is “bad” for one purpose may be very “good” for another, and vice versa.
For both MUSTANG and ISPAQ, metrics are calculated for each N.S.L.C. target, and generally are calculated on a per-day basis. There are a few exceptions, such as the metrics that are based on the data from a window surrounding the start time of an earthquake.
The usage of the term “Metric” can get complicated and confusing. Generally, the term refers to a concept or algorithm, such as the metric “num_gaps”. A value is calculated for each metric-target-day. We tend to call these values “value” or “measurement”, hence accessing them via our measurements webservice. So IU.ANMO.01.BHZ has a sample_median value of 6241 counts for 2020-01-01 - here, the “metric” is sample_median and the “value” (or measurement) is 6241.
It is suggested that you familiarize yourself with the metrics that MUSTANG uses. A full list with a brief description can be found here - click on the red Current list of all metrics button to pull down the full list. Clicking on the Detailed Documentation button for any given metric will take you to a page that describes it more fully.
Threshold
Thresholds turn a metric from a simple description of the data into a way to determine whether the data is “good”. We use the term Threshold to mean a Metric plus a Cutoff Value … or, in many cases, a combination of metrics and cutoff values. For QuARG, we actually focus on the data that is “bad”, since Network Operators need to know where things are going wrong so that they can fix them.
One thing to keep in mind is that different instrument types - broadband, short period, strong motion, etc. - may have different cutoff values. For example, strong motion instruments have a very different noise profile than broadband instruments, and even a healthy strong motion instrument will have a significant portion of the noise profile that is above the New High Noise Model (Peterson, J, 1993, Observations and Modeling of Seismic Background Noise, U.S.G.S. OFR-93-322). So thresholds using pct_above_nhnm probably ought to have different cutoff values when applied to strong motion and broadband instruments.
It should be noted that in QuARG, some Thresholds are based on metadata as well. This can help you find cases where the metadata may be incorrect or incomplete.
Here are some examples of thresholds:
pct_above_nhnm > 90 && dead_channel_lin > 2
Dip != 0 [for horizontal channels]
num_spikes >= 500
percent_availability = 0
num_gaps / percent_availability >= 12
[average] num_gaps > 2
You will notice that there are a variety of types of thresholds in those examples: there are simple cases in the form of METRIC [operator] VALUE. There are also METRIC1 / METRIC2 [operator] VALUE. There are also thresholds that take an AVERAGE of a METRIC over the reporting period and compare that to a VALUE. There are some thresholds that apply only to certain subsets, such as only to the Horizontal or only to the Vertical channels. There are many possible ways to define a Threshold, and the Threshold Definitions Form allows you to do all of these types of comparisons, plus a couple more. This may make the thresholds a little more complicated, but we think that it is worth it to have greater flexibility for you, the user.
IMPORTANT NOTE: The Thresholds, and particularly the cutoff values, that come with QuARG are ones that we have found empirically to balance between false positives and false negatives. They are not set in stone, and will very likely benefit from refinement based on your own network. We have made it so that you can edit, add, or remove thresholds based on your own needs.
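The threshold forms described above can be sketched in a few lines of Python. This is only an illustration of the idea, not QuARG's own implementation: the metric names come from the examples above, while the daily values, the function names, and the choice to skip days where the ratio is undefined are assumptions made for this sketch.

```python
from statistics import mean

# Hypothetical daily measurements for a single target (values are made up).
daily = [
    {"num_gaps": 0, "num_spikes": 12, "percent_availability": 100.0},
    {"num_gaps": 30, "num_spikes": 640, "percent_availability": 2.0},
    {"num_gaps": 3, "num_spikes": 3, "percent_availability": 0.0},
]


def simple(day):
    """METRIC [operator] VALUE, here: num_spikes >= 500."""
    return day["num_spikes"] >= 500


def ratio(day):
    """METRIC1 / METRIC2 [operator] VALUE, here: num_gaps / percent_availability >= 12."""
    if day["percent_availability"] == 0:
        # Assumption for this sketch: skip days where the ratio is undefined.
        return False
    return day["num_gaps"] / day["percent_availability"] >= 12


def average(days):
    """[average] METRIC [operator] VALUE, here: mean num_gaps over the period > 2."""
    return mean(d["num_gaps"] for d in days) > 2


# Indices of the days that trigger each per-day threshold.
flagged_simple = [i for i, d in enumerate(daily) if simple(d)]
flagged_ratio = [i for i, d in enumerate(daily) if ratio(d)]
# The averaged threshold yields one answer for the whole reporting period.
flagged_average = average(daily)
```

Note that the per-day thresholds return a list of days, while the averaged threshold produces a single result for the reporting period.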
MUSTANG
MUSTANG is the Quality Assurance system that we have built at IRIS. It is essentially an entire workflow that ingests data from our archives and outputs a range of about 45 metrics. When data comes into the IRIS DMC, whether in realtime or latent, it triggers a series of steps that lead to metric calculation on that data. The UTC day after data is archived, MUSTANG will begin calculating metrics on the data. Note that archiving can be up to about a day after realtime data streams in, due to the way that the data is pooled prior to archiving.
We store the metrics we have calculated in a series of databases that are accessible to users through our web services. Most of the metrics are accessed through the measurements service, though there are also a handful of other services that are primarily related to PSDs and PDFs.
If you are unfamiliar with our web services, in simple terms they are a way to enter a specific URL into your web browser and have the requested metric values returned to you. Or, you can use your favorite language (Python in the case of QuARG) to do the work for you.
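As a rough illustration of what "inputting a specific URL" means, the sketch below builds a request URL for the sample_median example used earlier. The base URL and the parameter names (`metric`, `net`, `sta`, `loc`, `cha`, `timewindow`, `format`) are assumptions about the measurements service interface and should be checked against the web service documentation; the function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed location of the MUSTANG measurements service; verify before use.
BASE_URL = "http://service.iris.edu/mustang/measurements/1/query"


def measurements_url(metric, net, sta, loc, cha, start, end):
    """Build a query URL for one metric and one N.S.L.C. target (assumed parameters)."""
    params = {
        "metric": metric,
        "net": net,
        "sta": sta,
        "loc": loc,
        "cha": cha,
        "timewindow": f"{start},{end}",
        "format": "text",
    }
    # Keep the comma in the time window readable instead of percent-encoding it.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


url = measurements_url("sample_median", "IU", "ANMO", "01", "BHZ",
                       "2020-01-01", "2020-01-02")
```

The resulting string can be pasted into a browser or fetched with any HTTP client.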
ISPAQ
Because MUSTANG is inherently built into the IRIS DMC, and we know that not all data streams into our archive, we have created a portable version of MUSTANG that users can install on their own computer to run metrics on their local data. This utility, ISPAQ, is a command line Python tool that can write metrics to a file system or to a sqlite database (in ISPAQ 3.0, to be released soon). Those ISPAQ metrics that are written to a sqlite database can be accessed by QuARG by specifying the Metric Source from within the Preference File Form. This allows greater flexibility - networks can still use QuARG to find issues in their network even if the data does not get archived at the IRIS DMC and we do not have MUSTANG metrics for that data.
QuARG is distributed through GitHub, via IRIS’s public repository (iris-edu). You will use a git client command to get a copy of the latest stable release. In addition, you will use the miniconda Python package manager to create a customized Python environment designed to run QuARG properly.
If running macOS, Xcode command line tools should be installed. Check for existence and install if missing:
xcode-select --install
Follow the steps below to begin running QuARG.
You must first have git installed on your system. This is a commonly used source code management system and serves well as a mode of software distribution, as it is easy to capture updates. See the Git Home Page to begin installation of git before proceeding further.
After you have git installed, you will download the QuARG distribution into a directory of your choosing from GitHub by opening a text terminal and typing:
git clone https://github.com/iris-edu/quarg.git
This will produce a copy of this code distribution in the directory you have chosen. When new quarg versions become available, you can update QuARG by typing:
cd quarg
git pull origin main
Anaconda is quickly becoming the de facto package manager for scientific applications written in Python or R. Miniconda is a trimmed down version of Anaconda that contains the bare necessities without loading a large list of data science packages up front. With miniconda, you can set up a custom Python environment with just the packages you need to run QuARG.
Proceed to the Miniconda web site to find the installer for your operating system before proceeding with the instructions below. If you can run conda from the command line, then you know you have it successfully installed.
By setting up a conda virtual environment, we assure that our QuARG installation is entirely separate from any other installed software.
You will go into the quarg directory that you created with git, update miniconda, then create an environment specially for quarg. You have to activate the QuARG environment whenever you perform installs, updates, or run QuARG.
cd quarg
conda update conda
conda create --name quarg -c conda-forge --file quarg-conda-install.txt
conda activate quarg
See what is installed in our (quarg) environment with:
conda list
Every time you use QuARG, make sure that you conda activate quarg to ensure that it will run smoothly.
To run QuARG, simply activate the conda environment and run it as a python script:
(quarg) bash-3.2$ python QuARG.py
Especially on Linux machines, if QuARG still isn’t working, make sure you have the OpenGL and mesa-dri libraries installed.
If it hasn’t been emphasized enough yet, QuARG facilitates a workflow through which an analyst creates a final report.
Within QuARG, there are a number of working parts that interact with each other to move the user through this process. The goal of this section is to provide an overview of how each of those cogs interact with each other within the larger machine, so that you might have a better understanding of which processes and files rely on each other.
Elsewhere in the documentation, we spend considerable time detailing each of the processes and how to use them. Here, we gloss over the specifics and simply introduce the parts to you.
QuARG is a program to walk you through the four steps of report generation (Finding Issues, Examining Issues, Creating Tickets, Generating Report), and each of those four steps interacts in some way with a file (or files) external to the program. These files are created, edited, used, and passed along to the next step in one way or another as the analyst transforms a list of potential issues into a definitive Quality Assurance Report.
In the figure below, the top bar represents four of QuARG’s tabs. The row just below that shows sub-processes that can happen in some tabs, while the bottom row represents some of the most important files that QuARG generates (i.e., not the temporary files that are created and destroyed). The arrows indicate whether the QuARG process writes to or reads from the external file.
To edit the Preference file (preference_file.py), you use the Edit Preference form - this will read from the existing preference file (if there is one) and write out to a new or existing file when you hit Save Preference File. Similarly, use Edit Thresholds to edit the thresholds.txt file. This will read from the existing file and write any changes to it when Save Thresholds is pushed.
Save
button.
Note that Examine Issues reads from and writes to the Issue File. As you work through the list of issues, marking their status (To-Do, New, Existing, False Positive, etc.) or adding notes to help you track your progress and understanding, the progress that you save is being written to the Issue File. This is all done to help you keep track of your work, but anything added here will not be included in the final report. It is a separate process to create a ticket, and tickets are the basis of the final report.
The Create Ticket form can be found from either the Examine Issues Pane or the View Tickets Tab. When you fill out the form and Submit the ticket, it is then added to the db/quargTickets.db database.
Once there are tickets in the database, they can be retrieved and viewed using the See Tickets button in the View Tickets Tab. This process will retrieve all of the information about the ticket from the database and display it. From there, you can edit any of the fields, and when you hit Submit it will write back to the database and overwrite the existing ticket. Delete will delete that ticket from the database.
This section has the most moving parts working in conjunction to convert the tickets in the database into a final report.
If Use QuARG Tickets for Report is selected, then QuARG will first translate the tickets in the database into a csv text file. If that button is not selected, then the csv file will not be created and it will need to be supplied by the user - this is so that users who use an external ticketing system can continue to use that other system, and they just need to export their desired tickets into a csv file of the correct format. More information on the details of the file is in the Ticket File section.
With the CSV file in place, QuARG then translates that file into the final report in .html format.
This means that a) if you change a ticket without regenerating the CSV file, then those changes will not be reflected in the final report, and b) changes made to the CSV file do not affect the tickets in the database.
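The one-way relationship between the ticket database and the CSV file can be sketched as follows. This is an illustration under stated assumptions, not QuARG's actual export code: the in-memory database stands in for db/quargTickets.db, and the table name, column names, and `export_tickets` function are hypothetical.

```python
import csv
import io
import sqlite3

# In-memory stand-in for the ticket database; the schema here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, target TEXT, subject TEXT)")
conn.execute("INSERT INTO tickets (target, subject) VALUES (?, ?)",
             ("IU.ANMO.00.BHZ", "Gaps"))


def export_tickets(connection):
    """Write all tickets to CSV text; the result is a snapshot of the database."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "target", "subject"])
    writer.writerows(
        connection.execute("SELECT id, target, subject FROM tickets ORDER BY id")
    )
    return buffer.getvalue()


snapshot = export_tickets(conn)

# Editing a ticket afterwards does not touch the CSV that was already written...
conn.execute("UPDATE tickets SET subject = ? WHERE id = 1", ("Gaps and spikes",))
stale = "Gaps and spikes" not in snapshot

# ...until the CSV is regenerated from the database.
refreshed = export_tickets(conn)
```

Because the report is built from the CSV text, only `refreshed` would carry the edited subject into a report.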
Now that you have a very broad understanding of how QuARG works, the following delves into much greater detail about every step of the process.
Every time you use QuARG you must ensure that you are running in the proper Anaconda environment. If you followed the instructions above you only need to type:
cd quarg
conda activate quarg
after which your prompt should begin with (quarg). To run QuARG, you use the QuARG.py Python script that lives in the quarg directory. The example below shows how to get the QuARG GUI to start up. A leading ./ is used to indicate that the script is in the current directory.
(quarg) bash-3.2$ ./QuARG.py
or
(quarg) bash-3.2$ python QuARG.py
When you open the utility, you will see something that looks like this:
There are four tabs across the top and each tab is used for a specific part of the report process:
Find Issues is where you will generate a list of potential issues by combining MUSTANG metrics (Thresholds) together. From this tab, you can access two other important pages:
Find Issues
Examine Issues provides tools for processing the list of potential issues and diagnosing possible causes. You can also create tickets here.
View Tickets is where you can create, view, edit, and delete tickets.
Generate Report uses the tickets that a user specifies to generate a network report.
Within each tab there are a number of fields, most of which are common to all tabs.
First is the Preference File - this is the location where the program will find the preference file on the filesystem. You can type the path manually, or use the Browse button next to the Preference File field to navigate to and select the file. An example file preference_file_IRIS.py is provided in the base quarg directory, so when you use the Browse button you should find that file along with all of the other QuARG-related files.
Next to the Browse button is an Autofill button - once a file has been selected and the path is displayed in the Preference File field, this button will load the preference file and fill in the other fields. When this is done within one tab (say, Find Issues) it will also fill in the fields of the other tabs (Examine Issues, View Tickets, and Generate Report). More detail about the Preference File is given later.
If you do not want to use the values from the preference file, you can either opt to not use Autofill, or you can simply change the values in the fields you would like to change after you have Autofilled from the preference file. Any value that you input into one of the fields will overwrite the corresponding value from the preference file. If you change the Directory field, then you will likely want the new directory to be used in all parts of QuARG. To propagate that change through to the other tabs, use the Apply Changes button - after this is done, the new directory should be listed in all of the tabs that have the field.
At the bottom of each screen is a large Exit button. There is an exit button somewhere along the bottom of every screen in QuARG, and when pushed it will prompt a popup asking if you really want to exit the program. In addition to this Exit button, there are many screens that will also have a Return button, which will return you to the previous screen. All other buttons will be described in their corresponding sections below.
This is where you specify the targets and time range you are interested in for the report, which QuARG then uses to generate a list of potential issues.
Find Issues is where you will generate a list of potential issues by combining MUSTANG metric values (thresholds) together.
There are 8 fields that need to be filled in for this section. As described previously, once the Preference File field is set you can use your preference file to fill in the 7 remaining fields. The easiest way to do this is to use the Browse button, navigate to the appropriate preference file, load it, and then use the Autofill button to fill in the Start Date, End Date, Issue File, Network, Stations, Locations, and Channels.
When all fields are filled, the Find Issues button will begin the process of grabbing MUSTANG metrics, combining them based on the defined thresholds, and writing all potential issues to a text file (described earlier). If there happens to already be a file of the same name, the utility will ask if you want to overwrite the previous version or cancel the action.
To actually perform this task, the python script findIssues.py is used. This script also relies on thresholds.py and reportUtils.py. Each of these will be described in more detail later.
The preference file is a place to define values that will be used for reports each time a report is generated. This is to reduce the amount that the analyst needs to input each time, since they can set up a preference file once and use that for each subsequent report.
To edit the preference file, use the Edit Preferences button on the Find Issues tab of QuARG. That button will take you to a form that allows you to create or edit your preference file(s).
In general, the preference file is broken into a handful of broad categories:
Each of these categories contains fields that can and should be adapted for your particular network and report process. Note: If you have already loaded a Preference File on the Main Screen, then that file will automatically populate all of the fields in this form.
Very quickly, here is a table with all of the fields in the preference file:
Directories and Files | Targets | Thresholds | Header | Frequency |
---|---|---|---|---|
Filename | Network(s) | Threshold Groups | Author | Report Frequency |
Working Directory | Station(s) | Group-Thresholds | Project Name | |
Issue File | Location(s) | | | |
Ticket File | Channel(s) | | | |
Metric Source | H, V | | | |
Metadata Source | Instrument Groups | | | |
Below is a description of each of the fields.
There are a handful of directories and filenames that you can choose to edit. These come with some default values that will work fine for most users, but there may be a case where you want to change the name of some of the files that will be produced throughout the entire report generation process.
Filename: This is the file where the preferences will be written. Point this to an existing file if you want to edit the file, or a new one if you want to create a new Preference File. To do this, you can either type in the name of the file or use the Browse button to navigate to and choose the file.
Note: By hitting the Load File button, you can populate all of the other fields based on the values from the specified Filename, if it is already an existing Preference File.
Note: If you have already loaded a Preference File on the Main Screen, then that file will automatically populate all of the fields in this form.
Working Directory: The directory where the report subdirectories (see section on Report Frequency) will be written. Each report, with its associated files like the Issue File and Ticket File as well as the final report, will be written to a new subdirectory based on the Network and date. This Working Directory is where these subdirectories will be created. It defaults to the current directory, which contains the QuARG.py program.
Issue File: This file is where all potential issues are written during Find Issues, and it is the file that is loaded and edited when you Examine Issues. The Issue File field in the Preference File should be just the filename, omitting any directories, since the file will be written to the subdirectory dictated by the Working Directory/Network/date. Behind the scenes, this file is a ‘|’-separated text file that contains the fields:
# Threshold|Target|Start|End|Ndays|Status|Value|Notes
… though you shouldn’t have to do anything with the file itself because QuARG will load it up into a much more human-readable and manipulatable form for you.
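If you ever do want to inspect an Issue File outside of QuARG, the ‘|’-separated layout is easy to read. The sketch below is not QuARG's own loader: the field order comes from the header shown above, but the sample row (including how the status is spelled) and the `read_issues` function name are made up for illustration.

```python
import csv
import io

# Field order taken from the header line shown above.
FIELDS = ["Threshold", "Target", "Start", "End", "Ndays", "Status", "Value", "Notes"]

# Made-up file content; only the header layout comes from the documentation.
sample = """# Threshold|Target|Start|End|Ndays|Status|Value|Notes
poorTQual|IU.ANMO.00.BHZ|2020-01-03|2020-01-04|1|TODO||
"""


def read_issues(handle):
    """Return one dict per issue line, skipping blank lines and '#' comment lines."""
    issues = []
    for row in csv.reader(handle, delimiter="|"):
        if not row or row[0].startswith("#"):
            continue
        issues.append(dict(zip(FIELDS, row)))
    return issues


issues = read_issues(io.StringIO(sample))
```

For a real file, replace the `io.StringIO(sample)` argument with an open file handle.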
Generate Report
, the report relies on a file that contains the information about all of the included tickets. This is where you specify that file.
Generate Report
will first query the local sqlite database for tickets that match your selection criteria and write them out to this file before using the file to create the report.# Threshold|Target|Start|End|Ndays|Status|Value|Notes
If that’s the case, then the file generated using the external ticketing system must match this filename.
Metric Source: QuARG can retrieve metric values from either the IRIS MUSTANG web services or from a local sqlite database. ISPAQ is a portable version of MUSTANG that can be downloaded from GitHub and used to calculate metrics on your local machine from local data. A new feature in ISPAQ is that it can now write to a sqlite database, which can then easily be read by QuARG. If a local database will be used, use the Metric Source dropdown menu to select “Local ISPAQ SQLite Database” and then either browse to or type in the name of the database file to be used. Otherwise, “IRIS” should be selected from the dropdown menu.
Metadata Source: To use a local metadata file, use the Metadata Source dropdown menu to select “Local Metadata File” and either use the Browse button or type in the name of the file containing the metadata. Otherwise, “IRIS” should be selected in the dropdown menu.
#Network | Station | Location | Channel | Latitude | Longitude | Elevation | Depth | Azimuth | Dip | SensorDescription | Scale | ScaleFreq | ScaleUnits | SampleRate | StartTime | EndTime
IU|ADK|00|BH1|51.8823|-176.6842|129.0|1.0|1.0|0.0|Streckeisen STS-6A VBB Seismometer|1.98475E9|0.02|m/s|40.0|2018-09-05T02:00:00|
IU|ADK|00|BH2|51.8823|-176.6842|129.0|1.0|91.0|0.0|Streckeisen STS-6A VBB Seismometer|1.98475E9|0.02|m/s|40.0|2018-09-05T02:00:00|
IU|ADK|00|BHZ|51.8823|-176.6842|129.0|1.0|0.0|-90.0|Streckeisen STS-6A VBB Seismometer|1.98475E9|0.02|m/s|40.0|2018-09-05T02:00:00|
IU|ADK|00|HH1|51.8823|-176.6842|129.0|1.0|1.0|0.0|Streckeisen STS-6A VBB Seismometer|1.98474E9|0.02|m/s|100.0|2018-09-05T02:00:00|
IU|ADK|00|HH2|51.8823|-176.6842|129.0|1.0|91.0|0.0|Streckeisen STS-6A VBB Seismometer|1.98474E9|0.02|m/s|100.0|2018-09-05T02:00:00|
IU|ADK|00|HHZ|51.8823|-176.6842|129.0|1.0|0.0|-90.0|Streckeisen STS-6A VBB Seismometer|1.98474E9|0.02|m/s|100.0|2018-09-05T02:00:00|
IU|ADK|10|BH1|51.8823|-176.6842|130.0|0.0|2.0|0.0|Trillium 240 broad band|2.0231E9|0.02|m/s|40.0|2011-09-21T06:18:00|
IU|ADK|10|BH2|51.8823|-176.6842|130.0|0.0|92.0|0.0|Trillium 240 broad band|2.02339E9|0.02|m/s|40.0|2011-09-21T06:18:00|
IU|ADK|10|BHZ|51.8823|-176.6842|130.0|0.0|0.0|-90.0|Trillium 240 broad band|1.99378E9|0.02|m/s|40.0|2011-09-21T06:18:00|
IU|ADK|10|HH1|51.8823|-176.6842|130.0|0.0|2.0|0.0|Trillium 240 broad band|2.0231E9|0.02|m/s|100.0|2011-09-21T06:18:00|
IU|ADK|10|HH2|51.8823|-176.6842|130.0|0.0|92.0|0.0|Trillium 240 broad band|2.02338E9|0.02|m/s|100.0|2011-09-21T06:18:00|
IU|ADK|10|HHZ|51.8823|-176.6842|130.0|0.0|0.0|-90.0|Trillium 240 broad band|1.99377E9|0.02|m/s|100.0|2011-09-21T06:18:00|
IU|ADK|60|BH1|51.8823|-176.6842|130.0|0.0|2.0|0.0|Streckeisen STS-1VBB w/E300|8.63252E8|0.02|m/s|20.0|2018-09-05T02:00:00|
IU|ADK|60|BH2|51.8823|-176.6842|130.0|0.0|92.0|0.0|Streckeisen STS-1VBB w/E300|8.65013E8|0.02|m/s|20.0|2018-09-05T02:00:00|
IU|ADK|60|BHZ|51.8823|-176.6842|130.0|0.0|0.0|-90.0|Streckeisen STS-1VBB w/E300|1.07024E9|0.02|m/s|20.0|2018-09-05T02:00:00|
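A metadata listing in this layout can be read with a few lines of Python, which also makes a metadata threshold such as `Dip != 0 [for horizontal channels]` concrete. This is a sketch rather than QuARG's implementation: the header and rows are copied from the listing above, while the function names and the assumption that horizontals end in 1 or 2 are choices made for this example.

```python
HEADER = ("#Network | Station | Location | Channel | Latitude | Longitude | "
          "Elevation | Depth | Azimuth | Dip | SensorDescription | Scale | "
          "ScaleFreq | ScaleUnits | SampleRate | StartTime | EndTime")

# Three rows copied from the listing above.
ROWS = [
    "IU|ADK|00|BH1|51.8823|-176.6842|129.0|1.0|1.0|0.0|Streckeisen STS-6A VBB Seismometer|1.98475E9|0.02|m/s|40.0|2018-09-05T02:00:00|",
    "IU|ADK|00|BH2|51.8823|-176.6842|129.0|1.0|91.0|0.0|Streckeisen STS-6A VBB Seismometer|1.98475E9|0.02|m/s|40.0|2018-09-05T02:00:00|",
    "IU|ADK|00|BHZ|51.8823|-176.6842|129.0|1.0|0.0|-90.0|Streckeisen STS-6A VBB Seismometer|1.98475E9|0.02|m/s|40.0|2018-09-05T02:00:00|",
]


def parse_metadata(header, rows):
    """Map each '|'-separated row onto the column names from the header line."""
    names = [name.strip() for name in header.lstrip("#").split("|")]
    return [dict(zip(names, row.split("|"))) for row in rows]


def horizontals_with_nonzero_dip(records, horizontal_codes=("1", "2")):
    """Return N.S.L.C. targets whose horizontal channels report a non-zero dip."""
    return [
        ".".join((r["Network"], r["Station"], r["Location"], r["Channel"]))
        for r in records
        if r["Channel"][-1] in horizontal_codes and float(r["Dip"]) != 0.0
    ]


records = parse_metadata(HEADER, ROWS)
clean = horizontals_with_nonzero_dip(records)

# A deliberately altered copy of the first row (made up) to show a hit.
bad = dict(records[0], Dip="5.0")
flagged = horizontals_with_nonzero_dip(records + [bad])
```

The unmodified rows produce no hits, since both horizontals report a dip of 0.0.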
This section defines which targets (network, station, location, channel) will be used when retrieving quality assurance metrics (likely from MUSTANG, but possibly from ISPAQ) and metadata, and therefore which channels will be included in the issue list. It also defines the category of instrumentation used in the report.
As one of the first steps in the reporting process, QuARG uses the targets listed in the Preference File to go out to the MUSTANG web services (or ISPAQ) and gather all potentially useful metric values. This means that if the fields defined here are very broad, QuARG may try to retrieve a significant volume of metric values and potentially take a long time. After all measurements have been gathered, QuARG will use the targets defined in each of the selected Instrument Groups (see below) to find issues using the defined Thresholds. This means that there are two potential limiting factors affecting the targets to be used in a report: 1) those allowed by the Preference File’s Network, Station, Location, and Channel fields, and 2) those allowed by the Instrument Groups’ Network, Station, Location, and Channel fields. While there is no right or wrong way to do the target filtering, one suggested approach is to leave the Instrument Groups more broadly defined and then to be more specific within the Preference File. This way, the exact same Thresholds can be used even if you generate multiple Preference Files that each have different targets, say different network codes or different instrument types depending on your particular needs.
The following fields are fairly self-explanatory. Each of them can contain multiple values as a comma-separated list, and can include wildcards such as “*” or “?”:
Network(s): IU
Station(s): * or ANMO, ADK
Location(s): * or 01
Channel(s): BH?,HH?,EH?,SH? or BHZ
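To make the wildcard behavior of these fields concrete, here is a small sketch that checks a code against a comma-separated pattern list. It uses Python's `fnmatch` module as a stand-in; QuARG and the web services may interpret wildcards differently in edge cases (for example, `fnmatch` also understands `[...]` character sets), and the `matches` function name is hypothetical.

```python
from fnmatch import fnmatchcase


def matches(value, pattern_list):
    """Return True if value matches any pattern in a comma-separated list."""
    patterns = [p.strip() for p in pattern_list.split(",")]
    return any(fnmatchcase(value, p) for p in patterns)


# '?' stands for exactly one character, '*' for any number of characters.
bhz_selected = matches("BHZ", "BH?,HH?,EH?,SH?")
lhz_selected = matches("LHZ", "BH?,HH?,EH?,SH?")
station_selected = matches("ADK", "ANMO, ADK")
any_location = matches("00", "*")
```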
Related are a couple of fields that are used when performing the metric combinations for the thresholds.
Channel Codes: There are two fields here, and they define which codes are associated with the horizontal and vertical channels. For example, if the network contains 3-component stations that are labelled as E, N, and Z then we would define H: E,N and V: Z. But if the network uses 1, 2, and 3 then we would define H: 1,2 and V: 3 - or whatever your particular network does. These fields are included to allow networks to explicitly define what coordinate system they use in case there is variation. The channels defined here are what will be used when executing thresholds that have specifically dictated that H[orizontal] or V[ertical] channels should be used in that threshold (see Thresholds Definition Form).
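As a minimal sketch of how the H and V fields could be applied, the function below classifies a channel by the last character of its channel code (the orientation code). The function name and its default arguments are hypothetical and only mirror the two examples given above.

```python
def orientation(channel, horizontal="E,N", vertical="Z"):
    """Classify a channel as 'H', 'V', or None using its last character."""
    code = channel[-1]
    if code in [c.strip() for c in horizontal.split(",")]:
        return "H"
    if code in [c.strip() for c in vertical.split(",")]:
        return "V"
    # Channels matching neither list are excluded from H- or V-only thresholds.
    return None


# A network labelled E, N, Z versus one labelled 1, 2, 3.
enz = [orientation(c) for c in ("BHE", "BHN", "BHZ")]
numbered = [orientation(c, horizontal="1,2", vertical="3") for c in ("BH1", "BH2", "BH3")]
unmatched = orientation("BH1")
```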
Instrument Groups: Using the Threshold Definitions Form Instrument Group Popup, you can create instrument groupings that make it easy to tailor threshold definitions depending on the instrument. For example, broadband instruments have a different noise profile (PSDs or PDFs) than short period instruments, and therefore the thresholds relating to the noise levels ought to be different for broadbands and short periods. Using the Threshold Definitions Form, you can add or remove as many of these instrument groups as you want. Here in the Preference File Form you can select which of those instrument groups you are interested in using to Find Issues. To include a group, simply select it from the list. Any that are selected (blue) will be included in the Preference File. For example, you may have threshold definitions for broadband, short period, and strong motion instruments, but perhaps you want to use this preference file for looking at only the short period instruments. You can use this field to dictate that.
While the thresholds themselves are defined using the Thresholds Definition Form, this is where you get to decide which thresholds should actually be run. There are two ways to do this: 1) by associating Thresholds with certain Threshold Groups (or associating with no Group, which would mean that threshold is not run), or 2) by toggling on or off entire Threshold Groups altogether.
Threshold Groups: QuARG comes with five threshold groups to begin with. These can be thought of as general categories: Amplitudes, Completeness, Metadata, State of Health, and Timing. Using the Thresholds Definition Form, you can add or remove groups to this list depending on how you see fit for your usage. Here, within the Preference File Form, select any or all from the list of available options to determine which groups should be used when finding issues - any selected Threshold Group will be blue, and all that are selected will be included in the Preference File. For example, if Metadata is deselected then none of the thresholds associated with Metadata will run.
Associating Threshold Groups with Thresholds: Each of the thresholds can be placed into one of the Threshold Group categories. Once a threshold is associated with a Threshold Group, it will run along with all of the other thresholds in that group (given that the Threshold Group is selected, see above). To do this, hit the Edit Groups Dictionary button, which will open a page that has a list of all unused (unassociated) thresholds, all of the Threshold Groups, and a list of all of the associations:
Column 1 - Thresholds
The first column here, on the left, contains a selectable list of all Thresholds that have not yet been assigned to a Threshold Group. Since each threshold can only be associated with a single Threshold Group, this list will shrink or grow depending on how you associate Thresholds with Threshold Groups.
Column 2 - Threshold Groups
A selectable list of all possible Threshold Groups. This reflects the Threshold Groups that are defined in the Thresholds Form.
Column 3 - Associations
The third column contains a selectable list of all Threshold Group::Threshold associations, ordered alphabetically by Threshold Group, then by Threshold. This column is also dynamic and will grow or shrink depending on what associations you add or remove from the list.
To associate a Threshold with a Threshold Group: simply select one or more thresholds from column one, select the group that they should be associated with, and then hit the Add button. When this happens, all selected Thresholds should now show up in the third column and be removed from the first column.
To remove an association: select one or more lines from the third column and hit the Remove
button. When this happens, any selected rows will be removed from the third column, and those thresholds will now be listed in the first.
Note: A threshold can only be associated with one Threshold Group. If a threshold is not associated with any group, it will never be run. This may be something you want to do if, say, your network doesn’t record timing quality in blockette 1001. If that is the case, then the poorTQual threshold will trigger on every single channel for every day – this creates a significant amount of noise to wade through. In that case, I would suggest not associating poorTQual with any Threshold Group, so it is never run.
These fields are used only for the final report header and contain information about who generated the report, the scope of what the report covers, and contact information. Of all of the fields in the preference file these are perhaps the least important, though they must still be filled out.
In the Preference File Form there is one field relating to the report frequency: the aptly named Report Frequency.
To specify the Report Frequency, select one of the options from the dropdown menu.
This field is used to do behind the scenes date and directory-naming calculations:
First, it will select the date range associated with the previous period, which will be used to fill in the Start Date and End Date fields along the top banner of QuARG.
Note: The dates listed in those text boxes are what will be used throughout QuARG (for finding issues, etc) and once loaded from the preference file, those dates can be manipulated within those text boxes. You can change these dates manually, if you wish to cover a different timerange than is autofilled.
Second, it will set up the directory structure where all of the intermediate files and the final report will be held, as it creates a new subdirectory for each report. This directory will incorporate the start date of the reporting period so that it is easy to navigate through the reports.
Note: If you change the Start and End dates manually (see #1 above) then you may also want to change the name of the directory to match the dates that you are actually using for the report. If you change the name of the directory manually, hit the Apply Changes button to ensure that the new directory is used throughout QuARG.
There are currently four frequency options:
Quarterly: for example, Start Date: 2020-01-01 and End Date: 2020-04-01. The subdirectory is Working Directory/Network/YYYYMM/, where YYYYMM are the year and month that begin the previous quarter. For example, if you run QuARG between April 1st and June 30th, 2020, YYYYMM will be 202001.
Monthly: for example, Start Date: 2020-04-01 and End Date: 2020-05-01. The subdirectory is Working Directory/Network/YYYYMM/, so that a report started on May 5, 2020 would create a subdirectory of 202004/.
Weekly: for example, Start Date: 2020-08-03 and End Date: 2020-08-10.
Daily: for example, Start Date: 2020-08-11 and End Date: 2020-08-12.
Remember, both the Start/End dates and the subdirectory can be changed manually. If you change the dates, you may want to change the directory name to match.
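The date and directory calculations described above can be sketched in a few lines of Python. This is only an illustration, not QuARG's own code: the function names are hypothetical, the Weekly rule (a Monday-to-Monday week) is an assumption inferred from the example dates, and the YYYYMM naming is only documented above for Quarterly and Monthly reports.

```python
from datetime import date, timedelta
from typing import Tuple


def _months_back(first_of_month: date, months: int) -> date:
    """Step back a whole number of months from the first day of a month."""
    index = first_of_month.year * 12 + (first_of_month.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def previous_period(run_date: date, frequency: str) -> Tuple[date, date]:
    """Return (start, end) of the last complete period before run_date."""
    if frequency == "Quarterly":
        # First month of the quarter that contains run_date: 1, 4, 7 or 10.
        first_month = 3 * ((run_date.month - 1) // 3) + 1
        end = date(run_date.year, first_month, 1)
        start = _months_back(end, 3)
    elif frequency == "Monthly":
        end = date(run_date.year, run_date.month, 1)
        start = _months_back(end, 1)
    elif frequency == "Weekly":
        # Assumption: a reporting week runs Monday to Monday.
        end = run_date - timedelta(days=run_date.weekday())
        start = end - timedelta(days=7)
    elif frequency == "Daily":
        end = run_date
        start = end - timedelta(days=1)
    else:
        raise ValueError(f"unknown report frequency: {frequency!r}")
    return start, end


def report_subdirectory(start: date) -> str:
    """YYYYMM name, as described for Quarterly and Monthly reports."""
    return start.strftime("%Y%m")
```

With these definitions, `previous_period(date(2020, 5, 5), "Monthly")` gives 2020-04-01 to 2020-05-01, and `report_subdirectory` of that start date gives `202004`, matching the example in the text.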
Notes: Depending on the resources available and the size of the network, it can be easier to keep up with Monthly or Quarterly reports. For a longer period, any given report will take longer, but we have found that one Quarterly report takes less time than three Monthly reports. On the other hand, less frequent reports can mean that issues take longer to be diagnosed and remedied. Any given network will therefore need to strike a balance that meets its own needs.
Also, given the time required to produce a full report, the daily option is not recommended. If it is used, it may be wise to have a preference file that runs only a small handful of thresholds, for example if it runs only the availability thresholds. Otherwise, it may take too long to produce a daily report.
In many ways, Thresholds are the entire basis of QuARG and these Quality Assurance (QA) Reports. They are a way to take pre-computed MUSTANG or ISPAQ metric values and use them to find potential issues in the data. The Thresholds File is what QuARG uses to keep track of Instrument Groups (see Preference File Form) and Threshold Groups, as well as to define the thresholds themselves. To edit this file, use the Threshold Definitions Form. The file is thresholds.txt and is necessary for QuARG to Find Issues, which creates the file that is used to Examine Issues.
In case you need a refresher, some definitions are listed here.
The Threshold Definitions Form is where users can edit threshold definitions. It has all of the tools you will need to add, remove, or edit the Thresholds used in QuARG.
NOTE: it is very important that you save all changes before hitting the Return button and returning to the Main Page.
To access the Threshold Definitions Form, hit the Edit Thresholds button from the Find Issues Tab of the main page.
That button will take you to another form where you can add, update, or remove thresholds. Upon entry, you should see:
There are two important and interrelated components that need to be defined in this form:
Because different types of instrumentation have different characteristics, thresholds that work for one type of instrument often won't make sense for another. For this reason, we allow you to create threshold definitions based on instrument group. For example, QuARG comes with a threshold called “flat”. For broadbands, this threshold looks for any cases where sample_unique < 200, but since short period instruments tend to have a smaller range, the “flat” threshold triggers on short periods only when sample_unique < 50. If we used the cutoff value of 200 for the short period instruments, it would trigger on far too much healthy data, leaving the analyst to wade through false positives and wasting time.
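As a rough illustration of why a single threshold name can carry a different cutoff for each instrument group, the “flat” example could be modelled like this. The two cutoffs come from the text; the group labels, the dictionary layout, and the function name are hypothetical and do not reflect how thresholds.txt actually stores definitions.

```python
# Cutoffs from the "flat" example in the text. The group labels are
# hypothetical; real Instrument Groups are whatever the network defines.
FLAT_CUTOFFS = {
    "broadband": 200,
    "short_period": 50,
}


def is_flat(sample_unique: float, instrument_group: str) -> bool:
    """Return True when the "flat" threshold would trigger for this group."""
    cutoff = FLAT_CUTOFFS.get(instrument_group)
    if cutoff is None:
        # No definition for this Threshold/Instrument Group combination,
        # so the threshold simply does not apply.
        return False
    return sample_unique < cutoff
```

A short period channel with 120 unique sample values would be flagged under the broadband cutoff but is left alone under its own, which is the false positive the per-group definition is meant to avoid.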
So, in this form you can create and edit Thresholds (names), and you can also create and edit Instrument Groups. These Thresholds (names) and Instrument Groups can then be associated with each other by setting the definition of that Threshold for that Instrument Group. The definition is where you choose the metrics (or metadata) and cutoff values that will be used to find problems.
This form has a lot going on, so here is a breakdown:
1st Column
In More Detail:
This column consists entirely of Threshold names. It contains a scrollable, selectable list of existing Threshold names. From the example above, you can see the Threshold “flat” on the list. When a Threshold is selected, it will turn blue. If you would like to add or remove a Threshold from the list, hit the Add/Remove Thresholds button at the bottom of the list, which opens a popup that allows you to do just that. More on the Thresholds Popup below.
The Threshold selected in this column will be used in conjunction with the Instrument Group from the 2nd column as you define the Threshold.
2nd Column
In More Detail:
The 2nd column contains three general parts…
Instrument Groups
At top is a scrollable, selectable list of existing Instrument Groups. Instrument groups, which are discussed in more detail below, are used as a way to divide your network into a variety of groups. We have already mentioned the need to divide based on instrumentation type, such as broadband, short period, and strong motion. But there may also be a use in dividing the network spatially, especially if the network spans a wide range of environments. For example, the noise level of a healthy station in a very quiet location would be expected to be quite different from that of a healthy station in a noisy location, such as on the coast. Using the same threshold for both could potentially A) trigger the healthy, yet noisy, coastal station too frequently, or B) let an actual problem at the quiet site go unseen because the cutoff value is too great. The way that any given network divides up its instruments will depend strictly on its own needs.
To add, remove, or edit an Instrument Group, hit the Add/Update Instrument Groups button just below the list, which produces a popup that allows you to do that.
Threshold Groups
Below that is the Add/Update Threshold Groups button. The Threshold Groups are the same ones that you see in the Preference File Form: general categories that the Thresholds may fit into, for example Completeness or Amplitudes. The Preference File will allow you to specify which Thresholds belong in which Threshold Group, so this is simply where you can add or remove a group from the list. When the button is clicked, it will produce a popup that allows you to do that editing.
Definition Display
At the bottom is an area that will display the definition for the selected Threshold name and Instrument Group. If the white area is blank, then either there is no definition provided for that combination of Threshold and Instrument Group, or one (or both) of those fields has not been selected. When there is a definition displayed, each part of that definition is selectable and will turn blue when selected. This allows you to remove any part of the definition if you would like. To do that, simply hit the Delete button below and it will remove all selected parts of the definition.
Due to the flexibility in types of definitions (average, ratios, etc), there is some notation used in the display that helps both the user and QuARG know how to interpret that particular definition. Here is a quick summary of what you may see, and what it means:
Display | Meaning |
---|---|
average :: METRIC > VALUE | Take the average of the values over the entire requested time period and then apply the cutoff value. Note: while you can usually combine multiple metrics/cutoff values in one threshold, this is not the case when average is used: it must be the only part in the definition |
median :: METRIC > VALUE | Take the median of the values over the entire requested time period and then apply the cutoff value. Note: while you can usually combine multiple metrics/cutoff values in one threshold, this is not the case when median is used: it must be the only part in the definition |
METRIC1 / METRIC2 > VALUE | Take the ratio of the first metric to the second metric, then compare to the cutoff value |
METRIC1 > METRIC2 | Do a direct comparison of the value of the first metric to the second metric |
… METRIC[H] … | Apply this to only the horizontal channels, do not use for verticals. Note: Horizontal channels are defined in the Preference File Form under H |
… METRIC[H:avg] … | Use an average of the two horizontal channels. Note: Horizontal channels are defined in the Preference File Form under H |
… METRIC[H:vs] … | Compare the horizontal channels to each other. Note: Horizontal channels are defined in the Preference File Form under H |
… METRIC[V] … | Apply this to only the vertical channels, do not use for horizontals. Note: Vertical channels are defined in the Preference File Form under V |
… abs(…) … | Take the absolute value before doing the comparison |
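To make the meanings in the table above concrete, here is a minimal Python sketch of three of the definition types. It assumes one metric value per day and that ratio definitions are evaluated day by day; both are assumptions for illustration, and the function names are hypothetical rather than anything taken from QuARG itself.

```python
from statistics import mean, median
from typing import List, Sequence


def average_exceeds(values: Sequence[float], cutoff: float) -> bool:
    """average :: METRIC > VALUE, one answer for the whole time period."""
    return mean(values) > cutoff


def median_exceeds(values: Sequence[float], cutoff: float) -> bool:
    """median :: METRIC > VALUE, one answer for the whole time period."""
    return median(values) > cutoff


def ratio_exceeds(metric1: Sequence[float], metric2: Sequence[float],
                  cutoff: float) -> List[bool]:
    """METRIC1 / METRIC2 > VALUE, evaluated separately for each day.

    Days where METRIC2 is zero are treated as not triggering, which is a
    choice made for this sketch only.
    """
    return [m2 != 0 and m1 / m2 > cutoff for m1, m2 in zip(metric1, metric2)]
```

The average and median forms collapse the whole period to a single value before comparing, which is consistent with the table's note that they must be the only part of a definition.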
3rd Column
In More Detail:
Metrics
At the top is a selectable list of all of the MUSTANG metrics. This list comes from the IRIS MUSTANG webservices and is refreshed whenever QuARG is connected to the internet, so it should stay up to date as we add new metrics. When a metric is selected, it will fill in the text box labeled Field below. While you can simply type the metric you are interested in into the Field box directly, the list makes it easy to know what metrics are available to use.
Channel Options
The channel options allow you to specify whether a threshold, or part of a threshold, should apply to only the horizontal or vertical channels. In most cases, these will not be used since you will want to find issues associated with any and all of the channels. But there are some cases where you would want to limit things. For example, when looking for issues in the metadata you may want to find all cases where the horizontal channels have a Dip != 0. If you applied this threshold to all channels, then every vertical channel would be triggered since verticals ought to have a non-zero Dip. Another example would be rmsRatio, which compares the sample_rms of the vertical channel to an average of the horizontals.
There are 4 buttons for Channel Options:
V (verticals)
H (horizontals)
avg (average)
vs (versus)
The latter two can only be used in conjunction with H, and only when either Ratio or Direct Comparison is selected (see below). When selected, the button will turn blue. See the table above for more description of “average” and “vs.”
Defining the Threshold
This is where you specify the field, operator, and cutoff value to be used. The Field will autofill from either the Metrics list or the Metadata list when an item from either of those is selected. But what matters is the text listed as the Field here, not what item was selected from the list. So you can select, for example, num_gaps from the list, and it will autofill the Field with num_gaps, but you can then change the text to say num_overlaps. num_overlaps would then be the metric used in the threshold definition. Note: if the Metadata toggle button is selected (see below) then the Field must match an item in the Metadata list. If the Metadata toggle button is not selected, then the Field must match an item from the Metrics list.
The ABS option specifies that this threshold should take the absolute value of the metric before doing the comparison.
Below that are the operator options: <, =, !=, >. Greater than or Less than can be used in conjunction with Equal to or Not Equal to.
Finally, there is the Value, which is the cutoff value for the threshold and in most cases must be numeric (the exception is some metadata fields, such as ScaleUnits).
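One way to picture how the Field value, the ABS option, the operator, and the cutoff Value fit together is the following sketch. The mapping and function name are hypothetical; in particular, writing a combined selection of `<` and `=` as `<=` is an assumption about spelling, not QuARG's documented notation.

```python
import operator

# Hypothetical mapping from operator selections to Python comparisons.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,   # "<" selected together with "="
    "=": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,   # ">" selected together with "="
    ">": operator.gt,
}


def triggers(value, op: str, cutoff, use_abs: bool = False) -> bool:
    """Apply one part of a threshold definition to a single value."""
    if use_abs:
        value = abs(value)
    return OPERATORS[op](value, cutoff)
```

Because equality and inequality work on strings as well as numbers, the same comparison covers non-numeric metadata fields such as ScaleUnits.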
4th Column
In More Detail:
Metadata List
This is a scrollable, selectable list of all metadata fields that can be used in QuARG. These are based on the IRIS station service headers at the channel level in the text format. When a field is selected, it will turn blue and will automatically fill in the Field in column 3. The metadata list is disabled by default, and only becomes available when the Metadata toggle button is selected (see below).
Threshold Options
There are five options available:
Option | Description |
---|---|
Average | Take the average value over the requested time span. While most thresholds can include many parts in succession, average cannot. Cannot be used in conjunction with Metadata. |
Median | Take the median value over the requested time span. While most thresholds can include many parts in succession, median cannot. Cannot be used in conjunction with Metadata. |
Ratio | Take a ratio of two metrics (or channels) and compare that ratio to a cutoff value. Enables the avg and vs options for the horizontal channels. This is a more complex threshold, see below for more description |
Direct Comparison | Compare two metrics (or channels), without a cutoff value. Enables the avg and vs options for the horizontal channels. This is a more complex threshold, see below for more description |
Metadata | Allows you to use a metadata field. Enables the Metadata List (see above), and disables the Average and Median options |
Add Thresholds
The Add Metric to Threshold Definition button takes all of the information you have provided (Field, Operator, Cutoff Value) and adds it to the definition of that threshold. To do so, all required fields must be provided (this depends on what type of threshold is being generated, see Complex Thresholds), and both the Threshold name and Instrument Group must be selected. When this button is pushed, given all fields are properly supplied, the definition will show up in the Definition Display.
Save
IMPORTANT: To save any and all changes made, you must push the Save button. Otherwise, all work will be lost when returning to the Main Page.
Two of the threshold types are more complex: Direct Comparison and Ratio. This is because they must involve either two metrics or two channel types. Since you can only input one metric or channel type at once, it is a multi-step process to add these types of thresholds.
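Adding a Direct Comparison threshold requires 3 steps. With the Direct Comparison button selected:
1. Provide the first metric (or channel type) and hit Add Metric to Threshold Definition. It will appear in the Definition Display, along with " :: compare" at the end.
2. Provide the second metric (or channel type) and hit Add Metric to Threshold Definition. It will also appear in the Definition Display, along with " :: compare" at the end.
3. Clear the Field, and select the required operator (>, !=, =, or <). Hit Add Metric to Threshold Definition.
Notes:
For the final step, the Field must be empty. If not, it will warn you.
A Direct Comparison does not use a cutoff Value.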
Adding a Ratio threshold requires 3 steps. With the Ratio button selected:
1. Provide the first metric (or channel type) and hit Add Metric to Threshold Definition. It will appear in the Definition Display, along with " :: ratio" at the end.
2. Provide the second metric (or channel type) and hit Add Metric to Threshold Definition. It will also appear in the Definition Display, along with " :: ratio" at the end.
3. Clear the Field, and select the required operator (>, !=, =, or <). Provide a numeric cutoff Value. Hit Add Metric to Threshold Definition.
Notes:
For the final step, the Field must be empty. If not, it will warn you.
Both the Value and the operator must be supplied.
When adding or removing a Threshold name from the list, you use the Thresholds Popup, via the Add/Remove Thresholds button in the Threshold Definitions Form. This looks like:
This is where you can add a threshold to or remove a threshold from the list of all thresholds.
Add Threshold
To add a new threshold, simply:
Now that threshold is available from the list of thresholds (on the left) when you return out of this popup. From there you can select the new Threshold and an Instrument Group, then supply a new definition.
Remove Threshold
To remove an existing threshold:
Note that if you delete a threshold, it also deletes all definitions (for all Instrument Groups) associated with it - so remove with caution. If you no longer want to use a threshold, it may be a better move to simply disassociate that threshold from any Threshold Group using the Preference File Editor. That way, the threshold and all definitions will still exist, but that threshold will never run.
Remember: to save changes made here, you must hit the Save Thresholds button after returning out of this popup.
There are a variety of reasons why you may want to have more than one definition for a given Threshold. One reason would be that different types of instruments (for example, strong motion, short period, and broadband) intentionally have different characteristics. These differences manifest themselves in the metric values, and if the same cutoff value is used for different types of instrumentation then it is very likely that either a) issues will be missed in some of your instruments, or b) a lot of false positives will be flagged and overwhelm your analyst. Similarly, if your network spans a variety of environments, then you may expect different behaviours based on the location of the station. A station near the ocean will be inherently noisier than one in a very quiet location, and either the thresholds will always flag the noisy stations or the thresholds will miss true issues that might pop up in the generally quiet locations.
To deal with this, we created the concept of an Instrument Group - which is just like it sounds, a group of instruments. Each Instrument Group can have a different definition for a given Threshold, if necessary. Or if the threshold simply doesn’t make sense for a particular instrument group, then it doesn’t have to have a definition associated with it.
This is where you can add, edit, or remove an Instrument Group.
Add Instrument Group
Enter a name for the new group in the Group Name field, supply the targets it should cover, and hit the Add/Update Group button.
Note: The targets specified here may be further constrained by the targets specified in the Preference File. For example, if you have an Instrument Group that includes channels “BH?,HH?” but the Preference File says to run only on BH? channels, then only BH? channels will be run.
Update Instrument Group
Enter the group's name in the Group Name field, or select it from the Existing Groups dropdown menu. When an existing group is selected, its current values appear in the Networks, Stations, Locations, and Channels fields and can be changed from there. Otherwise supply the fields as you want them to be. Then hit the Add/Update Group button.
Remove Instrument Group
Enter the group's name in the Group Name field, or select it from the Existing Groups dropdown menu, then hit the Delete button. QuARG will double check that you really want to delete the Instrument Group before deleting it.
Note that if you delete an Instrument Group, it will delete all Threshold definitions associated with it and it will no longer be accessible for the Preference File.
Remember: to save changes made here, you must hit the Save Thresholds button after returning out of this screen.
Threshold Groups can be thought of as general categories that the thresholds would fall under. By default, QuARG comes with these Threshold Groups: Amplitudes, Completeness, Metadata, State of Health, and Timing.
Threshold Groups are used when QuARG executes Find Issues. In the Preference File, you can determine which Thresholds are associated with which Threshold Groups, as well as choose which Threshold Groups to run. For each Threshold Group that is selected in the Preference File, all associated thresholds will run.
For example, if “Amplitudes” contains avgSpikes, badResp, dcOffsets, dead, and flat, then all five of those thresholds will run as long as the Preference File indicates that Amplitudes should run. If Amplitudes is deselected in the Preference File, then none of those five thresholds will run.
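The relationship between selected Threshold Groups and the thresholds that actually run can be sketched as a simple lookup. The Amplitudes entry mirrors the example above; the Completeness entry and the function name are made-up placeholders, and the dictionary is not the Preference File's real format.

```python
from typing import Dict, List

# Threshold Group -> associated thresholds. "Amplitudes" follows the example
# in the text; "exampleGapThreshold" is a placeholder name for illustration.
ASSOCIATIONS: Dict[str, List[str]] = {
    "Amplitudes": ["avgSpikes", "badResp", "dcOffsets", "dead", "flat"],
    "Completeness": ["exampleGapThreshold"],
}


def thresholds_to_run(associations: Dict[str, List[str]],
                      selected_groups: List[str]) -> List[str]:
    """Collect the thresholds belonging to every selected Threshold Group."""
    selected = []
    for group in selected_groups:
        selected.extend(associations.get(group, []))
    return selected
```

A threshold that appears in no group is never returned by this lookup, which corresponds to the point that an unassociated threshold never runs.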
This popup is where you can choose to add another group to the list, or remove an existing one. If you remove a Threshold Group here, it will still exist in any Preference Files that had been generated prior to the removal. To ensure that the group is truly removed, you will have to edit your Preference File to dissociate any Thresholds from that group. And to run those Thresholds, make sure to associate those Thresholds with another Threshold Group.
Add Threshold Group
Enter the new group's name in the Group Name field and hit the Add Group button.
Remove Threshold Group
Enter the group's name in the Group Name field, or select it from the dropdown menu Existing Groups, then hit the Delete button. It will double check that you truly want to delete the Group prior to actually deleting it.
Remember: to save changes made here, you must hit the Save Thresholds button after returning out of this screen.
There is little description needed for this popup: clicking the View Current Threshold Definitions button pops up a more human-readable list of all of the current threshold definitions. This means that you don’t need to click through each Threshold and Instrument Group combination to see what your definitions are. As soon as a change is made, it will be reflected in this popup. But remember: to save the changes for actual use, you must hit the Save Thresholds button!
After the Issue File has been generated using Find Issues, the next step is to sort through the discovered issues. Many will be true issues, but some will be false positives. It is important to sort out which are actually problematic, and to try to determine what the cause might be. This will help in remedying the issue.
The bulk of the time and work that the analyst spends on a network report will be in this part of the program. This was the original purpose of the very first version of QuARG: to keep track of all of the issues found using MUSTANG metrics. Here you can thumb through the (potential) issues station-by-station or by threshold, mark issues with different statuses to track what will need ticketing, what already has a ticket, and what can be ignored, write notes about what you’ve found, and use a variety of webservices to help in diagnosing the potential issue.
We have found that it is useful to pull up a list of all open tickets when you start going through the issues for a new report. If an open ticket is still a problem and manifesting itself in the same way, then you can simply mark the issues as Existing and move on, spending less time on it. If it has changed in some way, then that would be good to note in the Ticket. And if the problem has been resolved then the ticket should be updated and closed. It is easiest to do this at the beginning, before adding any new open tickets to the mix. At this point, QuARG does not have the ability to easily display both the Issues Pane and the Ticket List at the same time, though perhaps in the future it will. To facilitate this part of the process, it might be easiest to open two instances of QuARG: one to have a list of existing tickets open, the other to have a list of current issues. This is not ideal, but is certainly easier than flipping back and forth between See Tickets and Examine Issues.
Very important note: any actions you take in the following screen will not affect actual tickets. They are simply for the analyst’s own recordkeeping. Changes will be saved to the Issue File. Creating, editing, or deleting an actual ticket is done in the ticketing screen. Tickets are what ultimately go into a report.
When the user lands on the Examine Issues Tab, they see a very simple interface. Just like the Find Issues Tab, there are fields for Preference File and Issue File. In this case, the Preference File is optional: only the Directory and Filename are required.
If you have already Autofilled from preference file, all fields should already be filled in. Otherwise, you can use the Browse button to navigate to the preference file, load it, and Autofill. Alternately, you can enter the path to the Issue File directly or use the Browse button to navigate to it and load it.
The most important field here is the Issue File: the rest of the Examine Issues capabilities rely on reading from and writing to that file. If the field is not filled in, then there will be a message on the command prompt informing you that it cannot open the (missing) file.
When the Issue File field is properly filled, use the Examine Issues button near the bottom to load up a new screen.
When you land on the Examine Issues Screen, it will look something like this (except it probably won’t be empty):
All issues should show up in the Issue Pane when you first enter. If it was unable to load the file for any reason, it’ll give you a warning and produce an empty list. In addition, dates will autofill along the left side, based on the dates loaded from the Preference File (if applicable), and the default metric of num_gaps should show in the Metrics field. As you move through the issues, which are grouped by station, the Station field should update to indicate the current station.
There is a lot going on in this Examine Issues screen, so here we break it down section-by-section:
The Issue Pane will display information about the potential issues that were found. It consists of 8 columns:
Add Notes button)
Note: There are some thresholds that are an average or median over the entire report period. For these, the Start and End fields will be the start and end of the requested timeframe.
If there are many issues to display, the Issue Pane is scrollable to see all issues. Each row is selectable by clicking on it and multiple rows can be selected at once. To perform many actions, such as Add Notes or changing the status, you need to select the row that you wish to affect. More information about actions below.
The Inputs section contains fields that the analyst can write in and alter. How exactly these fields are used can vary, but the most common use for them is to be used as inputs into the web services (described in the Diagnosis Tools section below). They are also used in navigating through issues, such as specifying which targets or thresholds you want to view.
Metrics: used by the Metrics and Metric Plot buttons to view values for the specified metrics
Threshold: used by the Go To Threshold button, which will pull up all issues that match the provided threshold
Network: used by Go To Target
Station: used by Go To Target; updates automatically as the analyst scrolls through issues station-by-station
Location: used by Go To Target
Channel: used by Go To Target
For each of the target fields above (Network, Station, Location, and Channel), wildcards or comma-separated lists are accepted. Required fields are dependent on the web service being used. For example, Noise Modes calls on the Noise Mode Timeseries webservice, which requires a single target to be specified without wildcarding. In those cases where the analyst is interested in a target with an empty location code (''), they should try using '--' instead.
The navigation buttons allow the user to pull up issues in different ways.
Next and Previous will move through the issues station-by-station in alphabetical order.
To Do and All Remaining are ways to look at issues based on their Status. To Do will only display those issues that are still marked as TODO as the user navigates through them using the Next and Previous buttons. All Remaining will pull up one large list of all issues that are still marked as TODO.
Go To Target and Go To Threshold will take the inputs provided in the Inputs Pane and use those to pull up all matching issues. Go To Target will use any information provided in the Network, Station, Location, and Channel fields, wildcarding any that are left blank (and ignoring the threshold information). Go To Threshold will look for any thresholds matching the one provided in the Threshold input field and pull up all that match (ignoring the target information).
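The navigation behaviour described above amounts to filtering the issue list in different ways. The sketch below uses a plain list of dictionaries with hypothetical keys; the real Issue File layout is not described here, and exact matching on the threshold name is an assumption.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

Issue = Dict[str, str]  # hypothetical keys: network, station, location, channel, threshold, status


def field_matches(value: str, pattern: str) -> bool:
    """A blank input is wildcarded; otherwise accept wildcards or comma lists."""
    if not pattern:
        return True
    return any(fnmatchcase(value, p.strip()) for p in pattern.split(","))


def go_to_target(issues: List[Issue], network: str = "", station: str = "",
                 location: str = "", channel: str = "") -> List[Issue]:
    """Issues matching the target inputs, ignoring the threshold."""
    wanted = {"network": network, "station": station,
              "location": location, "channel": channel}
    return [i for i in issues
            if all(field_matches(i[key], pat) for key, pat in wanted.items())]


def go_to_threshold(issues: List[Issue], threshold: str) -> List[Issue]:
    """Issues for one threshold, ignoring the target."""
    return [i for i in issues if i["threshold"] == threshold]


def all_remaining(issues: List[Issue]) -> List[Issue]:
    """Every issue still marked TODO, in one list."""
    return [i for i in issues if i["status"] == "TODO"]
```

Calling `go_to_target(issues, station="ANMO")` leaves the other three fields blank, so they match anything, which mirrors the wildcarding of empty inputs.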
All Issues will pull up a list of every issue, all in one view.
As described earlier, there are two fields from the Issue Pane that can be altered to help the analyst keep notes as they process through the issue list. These fields are Status and Notes.
Each row in the Issue Pane is selectable, and multiple rows can be selected at once. By using the Select All or Deselect All buttons, you can either select all of the issues displayed on the screen, or deselect any that may have been selected. To change the status or notes of a line, first ensure that it is selected. Then:
Notes
To add notes, write them in the Notes text box and click the Add Notes button. The notes will immediately update for the selected lines. Be aware that the Issue Pane truncates the text for display only; the full text is still available. To see the full text, use the See Notes button, which displays a popup with the notes of any selected lines.
Status
There are a set number of status options, which roughly align with those available in the Ticketing System. Along the top there are seven possible statuses. The exact meaning can depend on the analyst's use, but they are generally self-descriptive:
Status | Description |
---|---|
To-Do | Marks the status as “TODO”, which is the default status. |
New | For issues that have a new ticket created |
Closed | For issues that have tickets that are now closed out because the issue has been fixed |
Existing | For issues that have tickets that already exist and the issue persists |
Support | For issues that are good to note but might not actually be a problem or shouldn’t be included in a report, and therefore have a support ticket in the Ticketing System. We often use this for issues such as strong diurnal signals that can cause false positives for some of the thresholds; having a support ticket that describes what we see means less time is spent investigating the issue for each report |
No Ticket | For things that might be real signals but are not worth a ticket |
False Positive | For times when the thresholds triggered when there was truly nothing wrong. This is useful for figuring out which thresholds might benefit from definition changes |
To change a status, simply click on the desired button and any selected rows will immediately update.
IMPORTANT NOTE: Any changes that you make here, whether adding notes or changing a status, are NOT immediately saved. To save your work, hit the Save button at the bottom of the screen. We suggest doing this often so that work is not lost if you accidentally close out of the program.
The diagnosis tools are a series of links to various web services that have proven useful in identifying and diagnosing the potential issues. Except for Databrowser
, they all take input from the Inputs section described above.
Tool | Description |
---|---|
Databrowser | Databrowser is a tool that allows users to plot MUSTANG metrics. These include Metric Timeseries (plotting metric values over time), Gap Duration plots, Network and Station boxplots, as well as some other options. It can be useful in looking at a network’s overall health, or to quickly view patterns in metric values over long periods of time. The Databrowser button does not require any of the Input fields to be filled. |
Waveforms | This button will retrieve and display waveform data from the IRIS timeseriesplot service. This requires all target fields to be specified, though it can accommodate a comma-separated list. Users must be careful with the requested Start and End times, as the service limits the length of time that can be plotted. Note: this returns a static image and is not recommended as the primary way of viewing waveforms. We expect the analyst to use another, more dynamic tool to view waveforms; this is simply a quick view of the data. |
Metrics | The Metrics button opens a web browser page that displays metric values from the MUSTANG Measurements web service. It uses input from all of the input fields except for Threshold . Start and End are used to limit the time range for the metrics retrieved; Metrics can be a comma-separated list of any desired metrics; Network , Station , Location , and Channel can all be wildcarded, lists, or left blank. Be careful of leaving fields blank, particularly Network , as that can create a very large query. |
Metric Plot | The Metric Plot button uses the same inputs as the Metrics button, but rather than opening a web page with tabular data, it generates a simple timeseries plot of the requested values. |
PDFs | Opens a webpage with monthly PDFs for the requested targets, beginning with the month of Start . |
Spectrograms | Opens a webpage with the spectrograms for the requested targets, for the time span of Start to End . If no dates are provided, it will use the entire span of the targets (from the beginning of the earliest target until the end of the latest target). |
Noise Modes | Opens a webpage to the Noise Mode Timeseries plot. All Network , Station , Location , and Channel fields must be filled, with only one target allowed (ie, no wildcarding or lists). Will use the Start and End dates. |
GOAT | GOAT is a fairly simple tool that displays data continuity and gaps visually. This button will open a webpage that displays the plot for the target and dates specified. All target fields must be filled, and only one target is allowed (no wildcarding or lists). |
Events | Opens a webpage of the USGS event service based on the Start and End dates specified. It will list all earthquakes M5.5 and larger, as MUSTANG event-based metrics do not calculate on smaller events. |
Station | Opens a channel-level web page of the IRIS Station service, using provided target information. Any blank field will be wildcarded, and lists and wildcards are allowed; start and end times are ignored for this diagnosis tool. |
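As a rough illustration of how the Input fields can turn into a web service request, the sketch below assembles a MUSTANG Measurements query URL. It is not QuARG's own code: the function name `build_measurements_url` is hypothetical, and the only query parameters used (`metric`, `net`, `cha`, `format`, `nodata`, `orderby`) are the ones that appear in the example link in the Ticket File section of this document. In this sketch, blank fields are simply left out of the query, which illustrates why a blank Network can lead to a very large request.

```python
from urllib.parse import urlencode

# Base URL copied from the example link in the Ticket File section.
MEASUREMENTS_URL = "http://service.iris.edu/mustang/measurements/1/query"


def build_measurements_url(metric, net="", cha=""):
    """Build a query URL, leaving out any field that was left blank."""
    fields = {
        "metric": metric,  # may be a comma-separated list of metrics
        "net": net,
        "cha": cha,
        "format": "text",
        "nodata": "404",
        "orderby": "start_asc",
    }
    filled = {name: value for name, value in fields.items() if value}
    # Keep wildcard and list characters readable instead of percent-encoding them.
    return MEASUREMENTS_URL + "?" + urlencode(filled, safe="*?,")


url = build_measurements_url("percent_availability", net="YO", cha="?XH")
print(url)
```

With these inputs the printed URL is the same as the example link shown in the Ticket File section.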
To help make it easier to a) remember what the definitions are for any given threshold, b) see what metrics are in which thresholds, and c) see a list of all possible metrics for your reference, there are a couple of buttons near the bottom that will provide this information.
Metric List
produces a popup that lists all of the metrics available from the MUSTANG web services. Unlike many things in QuARG, this list has fields that can be selected (via double click or click-and-drag of the cursor), copied, and pasted. That way you can copy a metric from the list and then paste it into the Metrics
field in the Inputs section above.
Threshold List
produces a popup of all defined thresholds so that you can easily see what metrics and cutoff values were used for any given threshold. This list also supports copy and paste, so again you can select (double click) a metric from the threshold, then copy and paste it into the Metrics
field of the Inputs section above.
Described in much greater detail below, the Create Ticket
button will open a new screen where an actual ticket can be created. Remember, only tickets are used in the final report, so no matter what other notes and status changes you make as described above, what truly matters are what tickets are created/edited/closed.
To save your progress, hit the Save
button. It is recommended to do this frequently so that work is not lost in case the program closes unexpectedly. When the Save
button is hit, the Issue File
(see Output Files) is updated to reflect the current state.
Tickets are the backbone of the final report. You will be able to filter which tickets are included in the report, but all information in the final report (descriptions, links, dates, images, etc.) comes directly from the tickets themselves. This means that you should provide as much information in the ticket as you want in the report. Depending on your particular use, it is worth thinking of these tickets as a way to convey the scope of the problem to the field crew so that they can adequately plan a way to fix the problem. What is the problem (to the extent that you can determine)? When did it begin? What station and channels are affected? What figures could be useful to explain the issue? On top of that, there is information that helps you, as the analyst, to track issues as they arise and are resolved and to track which thresholds and metrics are finding issues (this could be useful in tweaking threshold definitions).
Any real issue should have a ticket created for it. We have found that many networks want to know about and track issues that came and went within a reporting period, so as to have a complete picture of the network. If that is the case, then you may be creating Closed or Resolved tickets for those issues, in addition to creating New tickets for those issues that are still a problem. And of course, as the issue described in a New ticket is dealt with, you will want to transition it to In Progress, Resolved, or Closed. You are free to determine the exact use of the status options for your network, it’s just a good idea to be consistent.
This is a suggestion: At the beginning of every report, go through the open tickets to see if they are still problems. If so, you can leave that ticket open and mark the issue (in the Issue Pane
) as Existing. If it is no longer a problem, then close the ticket out with all of the updated information about it (such as an End Date). That way, while examining issues, you don’t have to wade through issues that you already know about and are ticketed. As of right now, it is not possible to have both the list of tickets and list of current issues open at once. As a way around this it may be worth opening two instances of QuARG and using one to look at and edit tickets, and the other to look at and mark issues. This is non-ideal, but is easier than flipping between the two within one application.
There are two ways to enter the Create Ticket window. One way is as listed above - through the Examine Issues window by clicking the Create Ticket
button at the top. The other way is through the Create Ticket
button in the View Tickets Tab
.
Regardless of how you get there, you will be taken to the same place, where you will fill in a bunch of information and then submit the ticket:
There are 14 fields that can be filled in, with some being required and others optional.
Required Fields
These are the Required fields, and are the core pieces of information needed to track an issue in QuARG:
Field | Required | Description |
---|---|---|
Tracker | required | Dropdown menu with two options: Data Problems or Support . Most tickets will likely belong in Data Problems , but Support is useful for recurring or persistent signals that trigger certain thresholds but are not true or fixable problems. |
Subject | required | Text input. A short description of the issue, such as “Data Outage” or “Masses off Center” for example. |
Status | required | Dropdown menu with the following options to indicate the status of the issue/ticket: New , In Progress , Resolved , Closed , Rejected |
Category | required | Dropdown menu with the following options to indicate the general category of the issue: Amplitudes , Completeness , Metadata , Other , Timing |
N | required | Text input. The network(s) affected by the issue described in the ticket. Can be wildcarded or a comma-separated list, and the use of square brackets (eg, A[KV]) is acceptable, but beware that when doing a SQL query for the target later (during report generation), complex targets can be difficult to match against. |
S | required | Text input. The station(s) affected by the issue described in the ticket. Can be wildcarded or a comma-separated list, and the use of square brackets (eg, IM0[1234]) is acceptable, but beware that when doing a SQL query for the target later (during report generation), complex targets can be difficult to match against. |
L | required | Text input. The location(s) affected by the issue described in the ticket. Note, if there is no location code (“blank” code) then you should instead use “--”. Leaving the field blank will prevent the ticket from being written. Can be wildcarded or a comma-separated list, and the use of square brackets (eg, 0[01]) is acceptable, but beware that when doing a SQL query for the target later (during report generation), complex targets can be difficult to match against. |
C | required | Text input. The channel(s) affected by the issue described in the ticket. Can be wildcarded or a comma-separated list, and the use of square brackets (eg, BH[12Z]) is acceptable, but beware that when doing a SQL query for the target later (during report generation), complex targets can be difficult to match against. |
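The required-field rules in the table above can be sketched as a small check. This is only an illustration: the function `check_ticket` and the dictionary layout are assumptions made for the example, not QuARG's internal representation, and the example values are made up. The field names and dropdown options are copied from the table.

```python
# Field names and dropdown options are copied from the table above;
# the dictionary layout itself is an assumption made for this sketch.
REQUIRED_FIELDS = ("Tracker", "Subject", "Status", "Category", "N", "S", "L", "C")
DROPDOWN_OPTIONS = {
    "Tracker": {"Data Problems", "Support"},
    "Status": {"New", "In Progress", "Resolved", "Closed", "Rejected"},
    "Category": {"Amplitudes", "Completeness", "Metadata", "Other", "Timing"},
}


def check_ticket(ticket):
    """Return a list of problems found in a ticket dictionary (empty if none)."""
    problems = []
    for name in REQUIRED_FIELDS:
        # A field that is absent or only whitespace counts as blank.
        if not str(ticket.get(name, "")).strip():
            problems.append(f"missing required field: {name}")
    for name, options in DROPDOWN_OPTIONS.items():
        value = ticket.get(name, "")
        if value and value not in options:
            problems.append(f"unexpected value for {name}: {value}")
    return problems


# Illustrative ticket with the location field left blank.
example = {
    "Tracker": "Data Problems",
    "Subject": "Data Outage",
    "Status": "New",
    "Category": "Completeness",
    "N": "IU",
    "S": "ANMO",
    "L": "",
    "C": "BH[12Z]",
}
print(check_ticket(example))
```

The blank location is reported as a problem, mirroring the note above that a ticket with an empty L field will not be written.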
Then there are the optional fields, which supplement the required fields. In most cases, these fields will be filled with information, but it’s not always necessary that they are:
Field | Description |
---|---|
Description | Text input. A more detailed description of the issue. |
Image(s) | Displays any attached images. To add or remove images, use the Add/View Image(s) button (see below). If provided, the image will be copied locally and included in the report. |
Link(s) | Displays any supplied links. To add or remove links, use the Add/Edit Link(s) button (see below). If there is a link that ought to be included in the HTML report, it can be added here. For example, a link to PDFs or spectrograms can be helpful. |
Start | Text input. Date that the issue began, formatted as YYYY-mm-dd. If unknown, it can be left blank. |
End | Text input. Date that the issue ended or was resolved, formatted as YYYY-mm-dd. If unknown, or still in progress, it can be left blank. |
Thresholds | Selectable list. The thresholds that were flagged and led to the discovery of the issue. Multiple thresholds can be selected. |
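The Start and End fields use the YYYY-mm-dd format and may be left blank. A minimal sketch of reading such a field in Python follows. The helper names are hypothetical, and the check that End is not earlier than Start is an extra assumption of this example rather than documented QuARG behavior.

```python
from datetime import datetime


def parse_ticket_date(text):
    """Return a date for 'YYYY-mm-dd' text, or None if the field is blank."""
    text = text.strip()
    if not text:
        return None
    # Raises ValueError if the text is not in the expected format.
    return datetime.strptime(text, "%Y-%m-%d").date()


def check_date_range(start_text, end_text):
    """Parse both fields and make sure End does not come before Start."""
    start = parse_ticket_date(start_text)
    end = parse_ticket_date(end_text)
    if start is not None and end is not None and end < start:
        raise ValueError("End date is earlier than Start date")
    return start, end


# An issue that is still in progress: Start is known, End is left blank.
start, end = check_date_range("2019-12-01", "")
print(start, end)
```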
Add/View Image(s)
To add or remove images, add captions, or view images, use the Add/View Image(s)
button. This will supply a popup that includes:
Existing Caption section (below left)
New Caption input box (to the right), so that you can easily edit existing captions
New Caption input to selected image from the List of Images
When an image is added here, it will be reflected in the Image(s)
field in the Create Ticket Form.
Add/Edit Link(s)
To add or remove links, use the Add/Edit Link(s)
button. This generates a popup. When a link is added here, it will be reflected in the Link(s)
field in the Create Ticket Form.
Import Notes as Description - If a line is selected in the Examine Issues window, this will take the Notes field and import that into the Description field of the ticket. When multiple lines are selected, it will import the description if all Notes match. If there are multiple values in Notes, then it will print a message to the command line indicating this and will not import the Description.
Import Target - If a line is selected in the Examine Issues window, this will take the Target field and import that into the Target field of the ticket. When multiple lines are selected, it will combine the targets into the most concise target it is able to.
Clear Fields - Clears all fields in the ticketing form.
Clear Selection - Clears all selections from the Thresholds list.
Submit - Submits the ticket to the SQLite database. To edit or remove an existing ticket, you will need to use the See Tickets button in the View Tickets tab, described below.
To add a ticket, you need to hit Submit - this will add it to the SQLite ticketing database and clear all fields, prepping it for the next ticket.
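To make the idea of submitting a ticket to a SQLite database more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `tickets`, its columns, and the function `submit_ticket` are all hypothetical and chosen for illustration; QuARG's actual database schema is not described here and may differ.

```python
import sqlite3
from datetime import datetime

# In-memory database with a hypothetical table layout (not QuARG's schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE tickets (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           tracker TEXT, subject TEXT, status TEXT, category TEXT,
           target TEXT, start_date TEXT, end_date TEXT, updated TEXT)"""
)


def submit_ticket(conn, ticket):
    """Insert one ticket and return the id assigned by the database."""
    row = dict(ticket, updated=datetime.now().isoformat(timespec="seconds"))
    cursor = conn.execute(
        "INSERT INTO tickets (tracker, subject, status, category, target,"
        " start_date, end_date, updated)"
        " VALUES (:tracker, :subject, :status, :category, :target,"
        " :start_date, :end_date, :updated)",
        row,
    )
    conn.commit()
    return cursor.lastrowid


# Values borrowed from the example ticket in the Ticket File section.
ticket_id = submit_ticket(conn, {
    "tracker": "Support",
    "subject": "Example Ticket",
    "status": "In Progress",
    "category": "Other",
    "target": "UU BEI 01 EHZ",
    "start_date": "2019-12-01",
    "end_date": "2019-12-03",
})
stored = conn.execute(
    "SELECT subject, status FROM tickets WHERE id = ?", (ticket_id,)
).fetchone()
print(ticket_id, stored)
```

Named placeholders keep the values separate from the SQL text, which avoids quoting problems when a subject or description contains commas or quotation marks.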
To update a ticket, you use a very similar but slightly different Update Ticket Form.
To see a list of all tickets that exist in the ticketing system, or some subset of tickets, use the View Tickets Tab. The fields on this tab narrow down which tickets are retrieved when the See Tickets
button is pushed. No fields are required here, and if everything is left blank then all tickets will be retrieved.
At the top is the same Preference File
field that exists on all other tabs, but it is optional here. If the preference file has already been loaded within another tab, it will also be displayed here. The Browse
and Autofill
buttons do the same as in other tabs as well - Browse
will popup a file listing where you can navigate to and select the preference file. Autofill
will take values from the preference file and fill in the target-related fields (Network
, Stations
, Locations
, and Channels
).
The date fields (Updated, Start Date, and End Date) each have a pair of On/Before and On/After buttons, which select tickets whose date falls on or before, or on or after, the date entered. Each pairing of On/Before and On/After buttons is associated with the field directly to its left, and the selected button will display as blue. If all time spans (before and after) are desired, then leave the date field empty.
Near the bottom, there are two buttons: Create Ticket
and See Tickets
. Create Ticket
leads you to the form to generate a new ticket, as described above. See Tickets
will use all provided inputs and find any matching tickets, which it will then display in a popup. Any line in that popup can be clicked on to view the ticket in more detail.
Tip: If See Tickets
isn’t displaying tickets that you know are in the system, then try loosening the query requirements. Removing target constraints, especially on the channel level, may be able to retrieve the ticket(s) you are looking for. QuARG has some fairly strict requirements for the target fields, so while it is able to understand a considerable range of notations (*, ?, [], comma), if either the ticket or the query have characters outside of QuARG’s capabilities, then it may not be able to retrieve those tickets easily.
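The wildcard notations mentioned in the tip (*, ?, [], and comma-separated lists) can be illustrated with Python's standard `fnmatch` module. This is a sketch of the general idea only, with hypothetical function names; it matches a query against a fully specified target and does not reproduce how QuARG compares a query with a ticket whose own target contains wildcards.

```python
from fnmatch import fnmatchcase


def field_matches(pattern, value):
    """True if value matches any comma-separated pattern; blank matches all."""
    if not pattern.strip():
        return True
    return any(fnmatchcase(value, part.strip()) for part in pattern.split(","))


def target_matches(query, target):
    """Compare (network, station, location, channel) tuples field by field."""
    return all(field_matches(pattern, value) for pattern, value in zip(query, target))


# Query from the example in this section: Network=IU, Channels=BH*.
query = ("IU", "", "", "BH*")
print(target_matches(query, ("IU", "ANMO", "00", "BHZ")))
print(target_matches(query, ("IU", "ANMO", "00", "LHZ")))
print(field_matches("BH[12Z]", "BH1"))
```

Loosening a query, as the tip suggests, corresponds to leaving more fields blank so that fewer comparisons can fail.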
Selecting the See Tickets
button will display a list of all tickets that match the criteria you provided. In the list, each line is selectable and if selected will take you to another window where you can view the ticket in more detail.
This list was generated using the following fields (all others left as blank or “See All”): Network
=IU, Channels
=BH*.
This screen displays the following: id (the id number of the ticket within the database table), Target, Start Date, End Date, Subject, Status, Tracker, and Updated date.
In the top right corner, there is a dropdown menu that allows you to sort the tickets by a number of different fields. By default tickets will sort by id, but selecting any of the other categories will immediately sort the tickets by that field. Once the sorting has been set, tickets will continue to sort that way until you either choose another sorting option or close QuARG.
By clicking on any line, it will take you to another page to view the ticket in more detail:
When you initially click on the ticket, it’ll open the Update Ticket form with all fields from the ticket loaded up. Below, I have used “Example Ticket 1” from the ticket list above:
In this case, fields that have information are the Tracker, Subject, Description, Status, Category, and Target Fields (N,S,L,C), Image(s), Start, and Thresholds. All others were left blank when the ticket was initially created or last updated.
The fields in this form can be edited just like when creating a ticket. If you make changes to the ticket, you can save those changes by clicking on the Update
button in the bottom left. This will update the ticket in the database to reflect any changes you made. For example, if the issue is resolved you will want to update the Status from New to Resolved. You may also want to add an End date that reflects when the station returned to normal.
NOTE: As the ticket is updated (the Update
button is pressed) the “Last Updated” datetime displayed at the top will change to reflect the time when that ticket was updated and a popup will let you know that it was able to save successfully.
Alternatively, you can delete the ticket by using the Delete Ticket
button. We suggest keeping tickets after an issue is resolved, but there are times when deleting a ticket makes sense. For example, if a ticket is accidentally generated or multiple tickets exist for the exact same issue.
Once all issues have been ticketed, and any old tickets have been closed or resolved, it is time to generate the report. The bulk of the report, the most important part of it, is derived directly from the tickets that are selected to be included. So it is important to make sure that your tickets include the type and volume of information that you want for your report.
Because you can use either the QuARG ticketing system or your favorite external ticketing system, the program utilizes a text file that translates between the tickets and the final report. This way, you can either use the internal tickets and QuARG will take care of generating this file, or you can produce the file manually and supply it to QuARG to generate the final report. The path to this file (whether created internally or externally) is a required input for creating a report, since QuARG needs to know where to load the text file from.
All of this is done using the Generate Report tab:
Much like the View Tickets Tab, this one has fields that help to subset the tickets that the program will retrieve and use in the Report. While every field except for Preference File
, Directory
and CSV File
is technically optional, if you do not provide some constraints then the report will include all issues, past and present, indiscriminately. So, at a very minimum, it is a good idea to subset based on the Start
and End
dates.
The following fields are available to edit:
Directory and CSV File, which can be filled in from the Preference File.
Available fields for subsetting include the usuals:
Start, with On/Before and On/After buttons; format YYYY-mm-dd
End, with On/Before and On/After buttons; format YYYY-mm-dd
If you are interested only in issues that still persist at the time that the report is generated, then you can select New
and In Progress
(depending on what you use in your tickets) and leave the Start
and End
fields blank.
If you are interested in any and all issues that were active during the reporting period, then you can select all Status
options and have an End
(On/After
) of the first day of the report period. Since a blank End in a ticket will match to any End query here, this will pull up all cases where there is either no end time for the ticket or it ended sometime during the reporting period, regardless of when it started.
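The End (On/After) selection described above can be sketched as a small filter. The function name `end_matches` and the ticket dictionaries are hypothetical and only illustrate the rule that a ticket with a blank End matches any End query; this is not QuARG's own query code.

```python
from datetime import date


def end_matches(ticket_end, query_date, mode):
    """Apply an On/Before or On/After End query; None stands for a blank field."""
    # A blank query keeps every ticket; a blank ticket End matches any query.
    if query_date is None or ticket_end is None:
        return True
    if mode == "on_after":
        return ticket_end >= query_date
    if mode == "on_before":
        return ticket_end <= query_date
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical tickets: one ongoing, one ended in the period, one ended earlier.
tickets = [
    {"id": 1, "end": None},
    {"id": 2, "end": date(2019, 12, 3)},
    {"id": 3, "end": date(2019, 10, 15)},
]
report_start = date(2019, 12, 1)
selected = [t["id"] for t in tickets if end_matches(t["end"], report_start, "on_after")]
print(selected)
```

Tickets 1 and 2 are kept: the first has no End date and the second ended after the first day of the report period, while the third ended before the period began.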
The CSV File is the Ticket File
as described in the Output Files section.
If you are using the internal QuARG ticketing system, you do not have to do anything special - when you hit the Generate Report
button, QuARG will query the database using the Optional fields above and write out to the Directory
/CSV Filename
, then immediately use that file to generate the final report.
If either file already exists - either the intermediary csv file or the final report - when you Generate Report
, QuARG will ask you if you want to rewrite that file. If both files already exist, then it will ask you twice: once for the csv file, then for the final report.
If you are using an external ticketing system, you must supply the file at Directory
/CSV Filename
yourself. See Ticket File
to learn more about the format and fields required if using an external system. When you hit Generate Report
, QuARG will look for this file and then generate the final report from it.
If the final report already exists when you Generate Report
, it will ask if you want to rewrite it.
In the end, the report will exist in the Directory
alongside the issue and ticket files. If there are images in the included tickets, they will be copied locally and the report and all images will live in a directory together, along with a zipped version of that directory for easier sharing. This means that, in the end, the directory will either contain:
No Images | Images Included |
---|---|
Issue File | Issue File |
Ticket File | Ticket File |
Final Report | Subdirectory containing a local copy of all images and the final report |
| | Zipped version of the subdirectory |
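The "Images Included" layout in the table above (a subdirectory holding the report and local image copies, plus a zipped version of it) can be sketched with Python's `shutil` module. The function `package_report` and the file names are hypothetical; this shows one way to produce that layout, not necessarily how QuARG does it.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def package_report(report_path, image_paths, directory, name):
    """Move the report and copy images into directory/name, then zip it."""
    directory = Path(directory)
    subdir = directory / name
    subdir.mkdir(parents=True, exist_ok=True)
    for image in image_paths:
        # Copy so that the original image stays where the ticket points to it.
        shutil.copy(image, subdir / Path(image).name)
    shutil.move(str(report_path), str(subdir / Path(report_path).name))
    # Creates directory/name.zip containing the name/ subdirectory.
    archive = shutil.make_archive(
        str(subdir), "zip", root_dir=str(directory), base_dir=name
    )
    return Path(archive)


# Small demonstration in a throwaway directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    report = tmp / "report.html"
    report.write_text("<html></html>")
    image = tmp / "figure.png"
    image.write_bytes(b"")
    archive = package_report(report, [image], tmp, "report_bundle")
    archive_name = archive.name
    with zipfile.ZipFile(archive) as bundle:
        names = sorted(bundle.namelist())
print(archive_name, names)
```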
Over the course of the full QA process (Find issues, Examine issues, Ticket issues, and Generate Report) there are a few files that will be generated. For the most part, the user does not need to be overly aware of the intermediate files, but it is important to have an understanding of the purpose of each file.
This is the file that contains a list of all potential issues discovered by using the MUSTANG metrics and thresholds. The precise name and location of the file are determined by the Directory
and Issue File
fields. Remember that any value input into the program will override the values in the Preference File.
The contents of this file are intended to be read into the Examine Issues tool of QuARG. As such, it is not the most human-readable, though it is certainly possible to decipher the contents of the file. Each field is separated by a pipe ‘|’, with the following fields:
field | Description |
---|---|
threshold | The threshold that was triggered for this particular target for these days |
target | Affected target |
start day | First of consecutive affected days - any consecutive days are grouped together into a single line |
end day | Final of affected days - any consecutive days are grouped together into a single line |
number of days | Number of consecutive affected days |
value | the value of any average or median thresholds |
status | Status of issue - this begins as “TODO” and will change as the status is updated using the Examine Issues tool |
notes | Any notes you want to leave for yourself about the issue to help keep track of what you know |
As the analyst works through the issues using the Examine Issues tool, any updates will be recorded in this file. That means that this file is where the analyst truly keeps track of the progress they make as they go through the issue list. Be careful to not accidentally delete this file, as all progress will then be lost.
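For anyone who wants to inspect the Issue File outside of QuARG, the pipe-separated layout above can be read with a few lines of Python. This sketch assumes one issue per line, no header row, and the eight fields in the order listed in the table. The sample lines, including the target notation, date format, and stored status strings, are invented for illustration.

```python
from collections import Counter

# Field order follows the table above; the key names are chosen for this sketch.
ISSUE_FIELDS = (
    "threshold", "target", "start_day", "end_day",
    "n_days", "value", "status", "notes",
)


def parse_issue_line(line):
    """Split one pipe-separated line into a dictionary of the eight fields."""
    # Limit the number of splits so that a pipe inside the notes is preserved.
    parts = line.rstrip("\n").split("|", len(ISSUE_FIELDS) - 1)
    if len(parts) != len(ISSUE_FIELDS):
        raise ValueError(f"expected {len(ISSUE_FIELDS)} fields, found {len(parts)}")
    return dict(zip(ISSUE_FIELDS, parts))


# Invented sample lines; real files may format targets and dates differently.
sample_lines = [
    "gapsRatioGt12|UU.BEI.01.EHZ|2019-12-01|2019-12-03|3||TODO|\n",
    "glitch|UU.BEI.01.EHZ|2019-12-05|2019-12-05|1||False Positive|checked, nothing wrong\n",
]
issues = [parse_issue_line(line) for line in sample_lines]
status_counts = Counter(issue["status"] for issue in issues)
print(status_counts["TODO"], issues[1]["notes"])
```

Because the file is the record of the analyst's progress, a script like this should only read it; edits are best made through the Examine Issues tool.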
In addition there is a backup copy of the issue file. This file has the same name as the original file, but with an additional .bck extension: backup_copy = output_file + '.bck'
After all issues have been examined and real issues ticketed, as well as existing tickets updated to reflect any potential changes since the previous report, there is an intermediate file that bridges the gap between the tickets and the final report. If the analyst is using the built in ticketing system, which is the most likely case, the file is generated using the Generate Report button in the Generate Report tab.
The file is a comma separated value file that includes information that will be put into the report. At the top is a header with the fields. Much like the Issue File, it is not meant to be used by humans directly but instead read back into QuARG as an input to the Generate Report script (using the Generate Report button in the Generate Report tab).
The file includes the following fields:
field | Description |
---|---|
ID | Ticket ID number |
Tracker | Which tracker the ticket was assigned to |
Target | Target(s) covered by that ticket |
Start Date | Start date of identified issue. May be left blank if unknown |
Category | The general category that the issue falls under. For example: Amplitudes, Completeness, Metadata, Timing, Other |
Subject | A few word description of the problem |
Thresholds | Thresholds used to discover the issue |
Images | Path to the image, if included |
Image Captions | Caption for the image, if included |
Links | URLs to be included as links in the report |
Status | Status of ticket. If using the internal ticketing system, statuses include: New, In Progress, Resolved, Closed, Rejected |
End Date | Date that issue was resolved, if applicable |
Description | A detailed description of the issue |
In the case where the analyst would prefer to use an external ticketing system, QuARG is able to absorb the ticket information through the use of this CSV file. This file is ingested and converted into the final HTML Report, but to do so it must be of a very specific format. If using the internal ticketing system, then you do not need to know the details of this file.
At the very top of the file is the following header:
id,tracker,target,start_date,category,subject,thresholds,images,caption,links,status,end_date,description
The lines that come after that follow that pattern, with quotation marks (‘"’) around any fields that may have a comma in them. For example:
id,tracker,target,start_date,category,subject,thresholds,images,caption,links,status,end_date,description
4,Support,UU BEI 01 EHZ,2019-12-01,Other,Example Ticket,"gapsRatioGt12, glitch",/Users/laura/QA_reports/testImage.jpg,"This is a figure caption, with a comma so it has quotation marks",http://service.iris.edu/mustang/measurements/1/query?metric=percent_availability&net=YO&cha=?XH&format=text&nodata=404&orderby=start_asc,In Progress,2019-12-03,"This one has a start and end date, and a link!"
The most important thing is that the ticketing system used either has these fields, or has an equivalent, and that the tickets can be exported into a csv file of this format. Any missing fields can be left blank if necessary. For example, using a Redmine ticketing system, we are able to use the ‘Export to CSV’ function and choose what columns are exported. It may take an intermediate step to convert the CSV into the correct format, in which case it is probably worth setting up a workflow to do the conversion for you. Depending on the complexity, it might be worth delving into the code to change the required format - just be wary of doing that: it may create unintended consequences.
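When preparing this file from an external ticketing system, it can help to check that it parses as intended before handing it to QuARG. The sketch below uses Python's standard `csv` module to read the example shown above and confirm that the header matches and the quoted fields keep their commas. The function name `read_ticket_file` is hypothetical and this is not QuARG's own reader.

```python
import csv
import io

# Header copied from this section.
TICKET_HEADER = [
    "id", "tracker", "target", "start_date", "category", "subject", "thresholds",
    "images", "caption", "links", "status", "end_date", "description",
]


def read_ticket_file(handle):
    """Read ticket rows as dictionaries after checking the header line."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != TICKET_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return list(reader)


# The example file from this section, held in memory for the demonstration.
example_csv = """id,tracker,target,start_date,category,subject,thresholds,images,caption,links,status,end_date,description
4,Support,UU BEI 01 EHZ,2019-12-01,Other,Example Ticket,"gapsRatioGt12, glitch",/Users/laura/QA_reports/testImage.jpg,"This is a figure caption, with a comma so it has quotation marks",http://service.iris.edu/mustang/measurements/1/query?metric=percent_availability&net=YO&cha=?XH&format=text&nodata=404&orderby=start_asc,In Progress,2019-12-03,"This one has a start and end date, and a link!"
"""
tickets = read_ticket_file(io.StringIO(example_csv))
print(tickets[0]["thresholds"], "|", tickets[0]["status"])
```

The same module's `csv.DictWriter` adds quotation marks automatically around fields that contain commas, which may be convenient when converting an export from another system.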
The HTML Report is the final file to come out of the process - it is the file that condenses all of the information that the analyst learned about the issues while sifting through them into a single, easy to navigate report. The report format is an .html file that will open in a browser, with a filename along the lines of NETWORK_Netops_Report_YYYYmm.html
.
If there are images included in the report, which is suggested, then all image files and the final report will be moved into a separate directory and zipped for easy transferral.
The Report contains four general sections:
The thresholds section will print the broadband thresholds if broadband
is included in the instruments list in the preference file, and will print the short period thresholds if short period
is included in the instruments list.
On top of the files described above, there are a few truly transient files that are generated and destroyed in the process of generating the final report. If the program is stopped or crashes in the middle of generating the final report, one or more of these files may exist and ought to be removed. Without going into detail, these include:
summaryFile = outfile + '.summary', where outfile is from the preference file
detailFile = outfile + '.detail', where outfile is from the preference file
In addition, there is a file that is created and destroyed when Finding Issues. If this file isn’t properly generated, or if it is somehow removed during the Find Issues process, then QuARG will pop up a warning.
failedMetrics.txt
keeps track of any and all metrics that produced an error when QuARG attempted to retrieve them. It is used to produce a popup that lists the failures so that the user is aware of the issue and can either run Find Issues again, or at minimum be aware that some thresholds may be incomplete because the metrics were not retrieved. If a warning is produced that the file failedMetrics.txt is missing, this could indicate that some process in QuARG crashed, and you may want to look at the command line for more information about the problem.
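If a crash during report generation leaves the transient summary and detail files behind, they can be removed by hand or with a short script. The sketch below assumes the naming described above (the output file name plus `.summary` or `.detail`); the function name and the demonstration file names are hypothetical.

```python
import tempfile
from pathlib import Path

# Suffixes of the transient files described in this section.
TRANSIENT_SUFFIXES = (".summary", ".detail")


def remove_transient_files(outfile):
    """Delete leftover transient files for outfile and return their names."""
    removed = []
    for suffix in TRANSIENT_SUFFIXES:
        leftover = Path(str(outfile) + suffix)
        if leftover.exists():
            leftover.unlink()
            removed.append(leftover.name)
    return removed


# Demonstration: simulate a leftover summary file in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    outfile = Path(tmp) / "report.html"
    Path(str(outfile) + ".summary").write_text("partial output")
    removed = remove_transient_files(outfile)
    still_there = Path(str(outfile) + ".summary").exists()
print(removed, still_there)
```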