Event-by-event trigger database
The ICARUS trigger database stores all the event-by-event information saved in the trigger fragments.
It lives on the ICARUS online cluster alongside the other PostgreSQL DCS and slow controls databases.
The primary host is currently `icarus-db02` (with a backup on `icarus-db03`), but it is also replicated offline in a read-only format on `ifdbdaqrep01`.
POT accounting requires matching ICARUS events with IFBEAM events.
While LArSoft events store the same trigger information, events are not saved sequentially in the DAQ output files.
This is due both to the different data streams and to the number of EventBuilder processes (given `n` EventBuilders, event `j` is packaged by the `(j % n)`-th one).
As a result, POT accounting would require accessing all the files in a run just to sort the triggers by time.
The trigger database was developed to simplify these operations offline, providing an easy way to query and sort the collected triggers.
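For illustration, the round-robin packaging mentioned above amounts to the short Python sketch below; the number of EventBuilders is made up.

```python
# Toy illustration of the round-robin event packaging: with n EventBuilders,
# event j is packaged by EventBuilder j % n, so consecutive events end up in
# different output files and are not time-ordered within any single file.
n_eventbuilders = 4            # hypothetical value, just for illustration

for j in range(10):            # a short run of consecutive event numbers
    print(f"event {j} -> EventBuilder {j % n_eventbuilders}")
```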
The scripts supporting and automatically filling the trigger database are in the following repository: https://github.com/SBNSoftware/icarus-trigger-database. The initial implementation was developed by Justin Mueller at https://github.com/justinjmueller/icarus_runinfo_database.
Data flow
The database is currently filled by reading the trigger boardreader logfiles.
This happens daily via cronjobs running on `icarus-evb06`.
A more robust implementation is planned, in which the database will be filled directly by an analyzer module in the main DAQ code.
- Trigger logfiles are created for every run in `/daq/log/boardreader/run*-icarustrigger.log`.
- Every day `backupTriggerLogs.sh` creates a compressed copy of any new logfile in `/home/nfs/icarus/TriggerLogBackup`.
  - Only logs older than 1 day are considered, to avoid ongoing runs.
- Every day `runFillTriggerDatabase.sh` checks that backup directory and fills the trigger database accordingly (see the sketch after this list):
  - File names already processed are skipped, unless the file size has changed.
  - Stub files are skipped.
  - The log is parsed for trigger data strings. Information missing in older trigger packet versions is defaulted to `-1`.
  - Logs and errors are saved to `/daq/log/triggerdb/`.
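The daily cycle can be summarized by the sketch below. This is not the actual code in the icarus-trigger-database repository: `parse_log` and the `db` helper are placeholders for the real log parsing and PostgreSQL operations, and only the backup directory path comes from the list above.

```python
# Sketch of the daily fill cycle described above (placeholder functions, not the
# actual icarus-trigger-database implementation).
import glob
import os

BACKUP_DIR = "/home/nfs/icarus/TriggerLogBackup"

def fill_trigger_database(parse_log, db):
    for path in sorted(glob.glob(os.path.join(BACKUP_DIR, "run*-icarustrigger.log*"))):
        name, size = os.path.basename(path), os.path.getsize(path)
        record = db.lookup_log(name)  # corresponding row of the 'triggerlog' table, if any
        if record and record["processed"] and record["file_size"] == size:
            continue  # already processed and unchanged: skip
        triggers = parse_log(path)  # extract the trigger data strings from the log
        if not triggers:
            db.mark_log(name, stub=True, file_size=size, processed=True)
            continue  # stub file containing no actual triggers
        db.insert_triggers(triggers)  # fill 'triggerdata', defaulting missing fields to -1
        db.mark_log(name, stub=False, file_size=size, processed=True)
```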
Tables
Trigger Data
The `triggerdata` table encapsulates all information that exists at the per-trigger level. It corresponds to the contents of the trigger TCP/IP packet.
The primary key is the pair (`run_number`, `event_no`).
Column | Type | Description | Default |
---|---|---|---|
`run_number` | integer | DAQ run number | |
`version` | integer | Version numbering for the trigger string data | |
`event_no` | integer | Event number | |
`seconds` | integer | Local time stamp of the global trigger (seconds) | |
`nanoseconds` | integer | Local time stamp of the global trigger (nanoseconds) | |
`wr_event_no` | integer | Event number from the White Rabbit | |
`wr_seconds` | integer | Time stamp of the global trigger (seconds) | |
`wr_nanoseconds` | integer | Time stamp of the global trigger (nanoseconds) | |
`enable_type` | integer | Enable gate type | |
`enable_seconds` | integer | Time stamp of the enable gate (seconds) | |
`enable_nanoseconds` | integer | Time stamp of the enable gate (nanoseconds) | |
`gate_id` | integer | Number of the current gate | |
`gate_type` | integer | Number encoding the type of gate (1: BNB, 2: NuMI, 3: BNBOffbeam, 4: NuMIOffbeam, 5: Calibration) | |
`gate_id_bnb` | integer | Gate ID (BNB) | |
`gate_id_numi` | integer | Gate ID (NuMI) | |
`gate_id_bnboff` | integer | Gate ID (offbeam BNB) | |
`gate_id_numioff` | integer | Gate ID (offbeam NuMI) | |
`beam_seconds` | integer | Time stamp of the beam gate (seconds) | |
`beam_nanoseconds` | integer | Time stamp of the beam gate (nanoseconds) | |
`trigger_type` | integer | Type of trigger logic (0: Majority, 1: MinBias) | |
`trigger_source` | integer | Originating cryostat of the trigger (0: Undecided, 1: East, 2: West, 7: Both) | |
`cryo1_e_conn_0` | text | 64-bit word with the status of the pairs of PMT discriminated signals (LVDS) for the EE wall | |
`cryo1_e_conn_2` | text | 64-bit word with the status of the pairs of PMT discriminated signals (LVDS) for the EW wall | |
`cryo2_w_conn_0` | text | 64-bit word with the status of the pairs of PMT discriminated signals (LVDS) for the WE wall | |
`cryo2_w_conn_2` | text | 64-bit word with the status of the pairs of PMT discriminated signals (LVDS) for the WW wall | |
`cryo1_east_counts` | integer | Counters of other activity in coincidence with the gate (other potential global triggers in the event) for the East cryostat | |
`cryo2_west_count` | integer | Counters of other activity in coincidence with the gate (other potential global triggers in the event) for the West cryostat | |
`mj_adder_source_east` | integer | Enumeration of trigger source in the East cryostat, specifically adder vs. majority (1: adders, 2: majority, 7: both) | -1 |
`mj_adder_source_west` | integer | Enumeration of trigger source in the West cryostat, specifically adder vs. majority (1: adders, 2: majority, 7: both) | -1 |
`flag_east` | integer | Flags true if a trigger was detected in the East cryostat | -1 |
`delay_east` | integer | Delay of the East trigger from the start of the beam gate in FPGA ticks (default: 90) | -1 |
`flag_west` | integer | Flags true if a trigger was detected in the West cryostat | -1 |
`delay_west` | integer | Delay of the West trigger from the start of the beam gate in FPGA ticks (default: 90) | -1 |
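As an illustration, a row from this table can be unpacked as in the sketch below. The hexadecimal interpretation of the LVDS connector words is an assumption and should be checked against the actual content of the text columns.

```python
# Sketch: interpreting a triggerdata row (dict-like, with the columns listed above).
GATE_TYPES = {1: "BNB", 2: "NuMI", 3: "BNBOffbeam", 4: "NuMIOffbeam", 5: "Calibration"}
TRIGGER_SOURCES = {0: "Undecided", 1: "East", 2: "West", 7: "Both"}

def describe_trigger(row):
    trigger_ns = row["wr_seconds"] * 1_000_000_000 + row["wr_nanoseconds"]
    beam_ns = row["beam_seconds"] * 1_000_000_000 + row["beam_nanoseconds"]
    gate = GATE_TYPES.get(row["gate_type"], "unknown")
    source = TRIGGER_SOURCES.get(row["trigger_source"], "unknown")
    # Assumption: the 64-bit LVDS words are stored as hexadecimal text;
    # each set bit flags one pair of discriminated PMT signals on that wall.
    ee_pairs = bin(int(row["cryo1_e_conn_0"], 16)).count("1")
    return (f"run {row['run_number']} event {row['event_no']}: {gate} gate, "
            f"{source} trigger, {trigger_ns - beam_ns} ns after the beam gate, "
            f"{ee_pairs} LVDS pairs fired on the EE wall")
```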
Trigger log
The `triggerlog` table contains metadata associated with each trigger log file.
It is used internally to keep track of log files that have already been processed.
The primary key is `log_name`.
Column | Type | Description |
---|---|---|
`log_name` | text | The name of the log file |
`stub` | bool | Boolean tagging the log file as containing no actual triggers |
`file_size` | integer | Size of the file on disk |
`run_number` | integer | Run number corresponding to the log file |
`processed` | bool | Boolean tagging the log file as processed/not processed |
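For illustration, the skip logic described in the data-flow section can be phrased in terms of a `triggerlog` row as in the sketch below; the row values are made up.

```python
# Sketch of the skip decision applied to a 'triggerlog' row (column names as in the
# table above; the specific values here are made up for illustration).
row = {"log_name": "run12345-icarustrigger.log.gz", "stub": False,
       "file_size": 1048576, "run_number": 12345, "processed": True}

def needs_processing(row, current_size):
    """A log is (re)processed if it was never processed or if its size has changed on disk."""
    return (not row["processed"]) or (row["file_size"] != current_size)

print(needs_processing(row, current_size=1048576))  # False: already processed and unchanged
print(needs_processing(row, current_size=2097152))  # True: the file size has changed
```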
How to access
Online version
Direct access to the database in the online cluster is reserved for filling operations through the shared `triggerdb_writer` user.
This is handled through the script configuration in https://github.com/SBNSoftware/icarus-trigger-database/blob/main/config.json.
For interactive access from `icarus-gateway03`, type:
psql -W -U triggerdb_writer -h icarus-db02 -p 5434 -d icarus_trigger_prd
You will be prompted for a password.
Please contact Donatella Torretta or Matteo Vicenzi for the password.
Offline standby replica
Access for querying and analyzing trigger data should be performed via the offline read-only replica on `ifdbdaqrep01`.
This is handled through the shared `triggerdb_reader` user.
For interactive access from any `icarusgpvm` host, type:
psql -W -U triggerdb_reader -h ifdbdaqrep01 -p 5455 -d icarus_trigger_prd
You will be prompted for a password.
Connection details to be used for Python/C++ libraries are:
db_params = {
'dbname': 'icarus_trigger_prd',
'user': 'triggerdb_reader',
'password': '******',
'host': 'ifdbdaqrep01.fnal.gov',
'port': '5455'
}
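As a minimal sketch, the parameters above can be passed directly to psycopg (3), assuming the replica is reachable (directly from a Fermilab host, or through the SSH tunnel described in the next section); the run number in the query is made up.

```python
# Minimal sketch: query the replica directly with psycopg (3) using db_params above.
import psycopg

with psycopg.connect(**db_params) as conn:
    rows = conn.execute(
        "SELECT event_no, wr_seconds, wr_nanoseconds FROM triggerdata "
        "WHERE run_number = %s ORDER BY event_no LIMIT 5",
        (12345,),  # made-up run number: replace with a run of interest
    ).fetchall()
print(rows)
```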
Please contact Donatella Torretta or Matteo Vicenzi for the password.
Access via Python script
The following example shows a way to access the offline database replica. Here, selected entries are turned into a Pandas dataframe. The PostgreSQL database is accessed via the SQLAlchemy API (2.0), which uses the psycopg (3) backend: all these packages need to be installed and operational.
In addition, the database server is not directly accessible from outside Fermilab, so a workaround is needed.
One option is to open an SSH tunnel to the server, hopping through a Fermilab server we can access.
For example, using `icarusgpvm03.fnal.gov` (after obtaining a Kerberos ticket):
ssh -x -L '5455:ifdbdaqrep01.fnal.gov:5455' -N "${USER}@icarusgpvm03.fnal.gov"
should open the needed tunnel.
Here is the example, with some comments within:
import sqlalchemy as sql
import pandas
import time
ConnectionSettings = {
'database': 'icarus_trigger_prd',
'username': 'triggerdb_reader',
'password': '******',
'host': 'ifdbdaqrep01.fnal.gov',
'port': 5455,
}
TableName = 'triggerdata'
#
# Custom settings
#
# The password is obviously not published. About the host: it is not accessible from outside Fermilab.
# One way around it is SSH tunnelling, given that we have access to icarusgpvm03.fnal.gov via a Kerberos ticket:
#
# ssh -x -L '5455:ifdbdaqrep01.fnal.gov:5455' -N "${USER}@icarusgpvm03.fnal.gov"
#
# will forward local port 5455 (see ConnectionSettings['port'] above)
# to the same port on host ifdbdaqrep01.fnal.gov (see ConnectionSettings['host']),
# making the database reachable via localhost.
#
ConnectionSettings['password'] = r'******'
ConnectionSettings['host'] = 'localhost'
#
# create an "engine" to create connections to the database with
#
DBconnectionURL = sql.URL.create('postgresql+psycopg', **ConnectionSettings)
DBengine = sql.create_engine(DBconnectionURL, echo=True)
print(f"{DBconnectionURL=}")
#
# make the table and its schema known to SQLAlchemy library
#
DBmetadata = sql.MetaData() # glorified dictionary of stuff that SQLAlchemy can track
# triggerDataTable = sql.Table(TableName, DBmetadata, autoload_with=DBengine) # this would load the whole table schema
# instead of declaring the whole table schema, we skip the part that we don't need
# by selecting columns ahead;
# all these columns are integers; the primary key is the composite (run_number, event_no)
SelectedColumns = (
'run_number', 'event_no',
'wr_seconds', 'wr_nanoseconds',
'beam_seconds', 'beam_nanoseconds',
'gate_type', 'trigger_type', 'trigger_source',
'gate_id', 'gate_id_bnb', 'gate_id_numi', 'gate_id_bnboff', 'gate_id_numioff',
)
PrimaryKeys = { 'run_number', 'event_no' }
triggerDataTable = sql.Table(
TableName, DBmetadata,
*[ sql.Column(colName, sql.Integer, primary_key=(colName in PrimaryKeys), nullable=False)
for colName in SelectedColumns
],
)
#
# read the whole table (all available runs) into a Pandas dataframe, except:
# * calibration gate events
# * minimum bias events
# This query loads >11M events.
#
# The database, at the time of writing, has hundreds of millions of entries,
# and tens of gigabytes of data.
# Even with a fast connection and a lot of available memory, reading the whole
# thing is daunting.
# Limiting the range of runs via `run_number` column is often useful.
#
selector = sql.select(triggerDataTable).where(sql.and_(
sql.between(triggerDataTable.c.gate_type, 1, 4), # not calibration gate
triggerDataTable.c.trigger_type == 0, # light-based trigger
))
# print(f"Query:\n{selector}")
startTime = time.time()
with DBengine.connect() as DBconn:
df = pandas.read_sql_query(selector, DBconn)
print(f"Whoa! it took {time.time() - startTime:.1f}\" to load {len(df)} entries and {len(df.columns)} columns from the database!!!")
#
# combine the two timestamp pieces
#
df['triggerTimestamp'] = df.wr_seconds * 1_000_000_000 + df.wr_nanoseconds
df['beamGateTimestamp'] = df.beam_seconds * 1_000_000_000 + df.beam_nanoseconds # note: still includes the 4 us veto time
df['triggerFromBeamGate'] = df.triggerTimestamp - df.beamGateTimestamp
del df['wr_seconds'], df['wr_nanoseconds'], df['beam_seconds'], df['beam_nanoseconds'],
df
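Finally, as the comments above suggest, it is often useful to restrict the query to a run range; continuing from the script above (the run numbers here are made up):

```python
# Same query as above, restricted to a made-up run range to keep the load manageable.
firstRun, lastRun = 9000, 9100  # hypothetical run range: adapt to the runs of interest
selector = sql.select(triggerDataTable).where(sql.and_(
    triggerDataTable.c.run_number.between(firstRun, lastRun),
    sql.between(triggerDataTable.c.gate_type, 1, 4),  # not calibration gate
    triggerDataTable.c.trigger_type == 0,             # majority trigger
))
with DBengine.connect() as DBconn:
    df = pandas.read_sql_query(selector, DBconn)
print(f"Loaded {len(df)} triggers from runs {firstRun}-{lastRun}")
```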