Event-by-event trigger database
The ICARUS trigger database stores all the event-by-event information saved in the trigger fragments.
It lives on the ICARUS online cluster alongside the other PostgreSQL DCS and slow controls databases.
The primary host is currently `icarus-db02` (with a backup on `icarus-db03`), but it is also replicated offline in a read-only format on `ifdbdaqrep01`.
POT accounting requires matching ICARUS events with IFBEAM events.
While LArSoft events store the same trigger information, events are not saved sequentially in the DAQ output files.
This is due both to the different data streams and to the number of EventBuilder processes (given `n` EventBuilders, event `j` is packaged by the `(j % n)`-th one).
As a result, POT accounting would require accessing all the files in a run just to sort the triggers by time.
The trigger database was developed to simplify these operations offline, providing an easy way to query and sort the collected triggers.
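For illustration, the round-robin packaging mentioned above amounts to the short Python sketch below; the number of EventBuilders is made up.

```python
# Toy illustration of the round-robin event packaging: with n EventBuilders,
# event j is packaged by EventBuilder j % n, so consecutive events end up in
# different output files and are not time-ordered within any single file.
n_eventbuilders = 4            # hypothetical value, just for illustration

for j in range(10):            # a short run of consecutive event numbers
    print(f"event {j} -> EventBuilder {j % n_eventbuilders}")
```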
The scripts supporting and automatically filling the trigger database are in the following repository: https://github.com/SBNSoftware/icarus-trigger-database. The initial implementation was developed by Justin Mueller at https://github.com/justinjmueller/icarus_runinfo_database.
Data flow
The database is currently filled by reading the trigger boardreader logfiles.
This happens daily via cronjobs running on `icarus-evb06`.
A more robust implementation is planned, in which the database will be filled directly by an analyzer module in the main DAQ code.
- Trigger logfiles are created for every run in `/daq/log/boardreader/run*-icarustrigger.log`.
- Every day `backupTriggerLogs.sh` creates a compressed copy of any new logfile in `/home/nfs/icarus/TriggerLogBackup`.
  - Only logs older than 1 day are considered, to avoid ongoing runs.
- Every day `runFillTriggerDatabase.sh` checks that backup directory and fills the trigger database accordingly (see the sketch after this list):
  - File names already processed are skipped, unless the file size has changed.
  - Stub files are skipped.
  - The log is parsed for trigger data strings. Information missing in older trigger packet versions is defaulted to `-1`.
  - Logs and errors are saved to `/daq/log/triggerdb/`.
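The daily cycle can be summarized by the sketch below. This is not the actual code in the icarus-trigger-database repository: `parse_log` and the `db` helper are placeholders for the real log parsing and PostgreSQL operations, and only the backup directory path comes from the list above.

```python
# Sketch of the daily fill cycle described above (placeholder functions, not the
# actual icarus-trigger-database implementation).
import glob
import os

BACKUP_DIR = "/home/nfs/icarus/TriggerLogBackup"

def fill_trigger_database(parse_log, db):
    for path in sorted(glob.glob(os.path.join(BACKUP_DIR, "run*-icarustrigger.log*"))):
        name, size = os.path.basename(path), os.path.getsize(path)
        record = db.lookup_log(name)  # corresponding row of the 'triggerlog' table, if any
        if record and record["processed"] and record["file_size"] == size:
            continue  # already processed and unchanged: skip
        triggers = parse_log(path)  # extract the trigger data strings from the log
        if not triggers:
            db.mark_log(name, stub=True, file_size=size, processed=True)
            continue  # stub file containing no actual triggers
        db.insert_triggers(triggers)  # fill 'triggerdata', defaulting missing fields to -1
        db.mark_log(name, stub=False, file_size=size, processed=True)
```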
Tables
Trigger Data
The `triggerdata` table encapsulates all information that exists at the per-trigger level. It corresponds to the contents of the trigger TCP/IP packet.
The primary key is the pair (`run_number`, `event_no`).
Column | Type | Description | Default |
---|---|---|---|
`run_number` | integer | DAQ run number | |
`version` | integer | Version numbering for the trigger string data | |
`event_no` | integer | Event number | |
`seconds` | integer | Local time stamp of the global trigger (seconds) | |
`nanoseconds` | integer | Local time stamp of the global trigger (nanoseconds) | |
`wr_event_no` | integer | Event number from the White Rabbit | |
`wr_seconds` | integer | Time stamp of the global trigger (seconds) | |
`wr_nanoseconds` | integer | Time stamp of the global trigger (nanoseconds) | |
`enable_type` | integer | Enable gate type | |
`enable_seconds` | integer | Time stamp of the enable gate (seconds) | |
`enable_nanoseconds` | integer | Time stamp of the enable gate (nanoseconds) | |
`gate_id` | integer | Number of the current gate | |
`gate_type` | integer | Number encoding the type of gate (1: BNB, 2: NuMI, 3: BNBOffbeam, 4: NuMIOffbeam, 5: Calibration) | |
`gate_id_bnb` | integer | Gate ID (BNB) | |
`gate_id_numi` | integer | Gate ID (NuMI) | |
`gate_id_bnboff` | integer | Gate ID (offbeam BNB) | |
`gate_id_numioff` | integer | Gate ID (offbeam NuMI) | |
`beam_seconds` | integer | Time stamp of the beam gate (seconds) | |
`beam_nanoseconds` | integer | Time stamp of the beam gate (nanoseconds) | |
`trigger_type` | integer | Type of trigger logic (0: Majority, 1: MinBias) | |
`trigger_source` | integer | Originating cryostat of the trigger (0: Undecided, 1: East, 2: West, 7: Both) | |
`cryo1_e_conn_0` | text | 64-bit word with the status of the pairs of PMT discriminated signals (LVDS) for the EE wall | |
`cryo1_e_conn_2` | text | 64-bit word with the status of the pairs of PMT discriminated signals (LVDS) for the EW wall | |
`cryo2_w_conn_0` | text | 64-bit word with the status of the pairs of PMT discriminated signals (LVDS) for the WE wall | |
`cryo2_w_conn_2` | text | 64-bit word with the status of the pairs of PMT discriminated signals (LVDS) for the WW wall | |
`cryo1_east_counts` | integer | Counters of other activity in coincidence with the gate (other potential global triggers in the event) for the East cryostat | |
`cryo2_west_count` | integer | Counters of other activity in coincidence with the gate (other potential global triggers in the event) for the West cryostat | |
`mj_adder_source_east` | integer | Enumeration of trigger source in the East cryostat, specifically adder vs. majority (1: adders, 2: majority, 7: both) | -1 |
`mj_adder_source_west` | integer | Enumeration of trigger source in the West cryostat, specifically adder vs. majority (1: adders, 2: majority, 7: both) | -1 |
`flag_east` | integer | Flags true if a trigger was detected in the East cryostat | -1 |
`delay_east` | integer | Delay of the East trigger from the start of the beam gate in FPGA ticks (default: 90) | -1 |
`flag_west` | integer | Flags true if a trigger was detected in the West cryostat | -1 |
`delay_west` | integer | Delay of the West trigger from the start of the beam gate in FPGA ticks (default: 90) | -1 |
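As an illustration, a row from this table can be unpacked as in the sketch below. The hexadecimal interpretation of the LVDS connector words is an assumption and should be checked against the actual content of the text columns.

```python
# Sketch: interpreting a triggerdata row (dict-like, with the columns listed above).
GATE_TYPES = {1: "BNB", 2: "NuMI", 3: "BNBOffbeam", 4: "NuMIOffbeam", 5: "Calibration"}
TRIGGER_SOURCES = {0: "Undecided", 1: "East", 2: "West", 7: "Both"}

def describe_trigger(row):
    trigger_ns = row["wr_seconds"] * 1_000_000_000 + row["wr_nanoseconds"]
    beam_ns = row["beam_seconds"] * 1_000_000_000 + row["beam_nanoseconds"]
    gate = GATE_TYPES.get(row["gate_type"], "unknown")
    source = TRIGGER_SOURCES.get(row["trigger_source"], "unknown")
    # Assumption: the 64-bit LVDS words are stored as hexadecimal text;
    # each set bit flags one pair of discriminated PMT signals on that wall.
    ee_pairs = bin(int(row["cryo1_e_conn_0"], 16)).count("1")
    return (f"run {row['run_number']} event {row['event_no']}: {gate} gate, "
            f"{source} trigger, {trigger_ns - beam_ns} ns after the beam gate, "
            f"{ee_pairs} LVDS pairs fired on the EE wall")
```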
Trigger log
The `triggerlog` table contains metadata associated with each trigger log file.
It is used internally to keep track of log files that have already been processed.
The primary key is `log_name`.
Column | Type | Description |
---|---|---|
`log_name` | text | The name of the log file |
`stub` | bool | Boolean tagging the log file as containing no actual triggers |
`file_size` | integer | Size of the file on disk |
`run_number` | integer | Run number corresponding to the log file |
`processed` | bool | Boolean tagging the log file as processed/not processed |
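For illustration, the skip logic described in the data-flow section can be phrased in terms of a `triggerlog` row as in the sketch below; the row values are made up.

```python
# Sketch of the skip decision applied to a 'triggerlog' row (column names as in the
# table above; the specific values here are made up for illustration).
row = {"log_name": "run12345-icarustrigger.log.gz", "stub": False,
       "file_size": 1048576, "run_number": 12345, "processed": True}

def needs_processing(row, current_size):
    """A log is (re)processed if it was never processed or if its size has changed on disk."""
    return (not row["processed"]) or (row["file_size"] != current_size)

print(needs_processing(row, current_size=1048576))  # False: already processed and unchanged
print(needs_processing(row, current_size=2097152))  # True: the file size has changed
```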
How to access
Online version
Direct access to the database in the online cluster is reserved for filling operations through the shared `triggerdb_writer` user.
This is handled through the script configuration in https://github.com/SBNSoftware/icarus-trigger-database/blob/main/config.json.
For interactive access from `icarus-gateway03`, type:
psql -W -U triggerdb_writer -h icarus-db02 -p 5434 -d icarus_trigger_prd
You will be prompted for a password.
Please contact Donatella Torretta or Matteo Vicenzi for the password.
Offline standby replica
Access for querying and analyzing trigger data should be performed via the offline read-only replica on `ifdbdaqrep01`.
This is handled through the shared `triggerdb_reader` user.
For interactive access from any `icarusgpvm` host, type:
psql -W -U triggerdb_reader -h ifdbdaqrep01 -p 5455 -d icarus_trigger_prd
You will be prompted for a password.
Connection details to be used for Python/C++ libraries are:
db_params = {
'dbname': 'icarus_trigger_prd',
'user': 'triggerdb_reader',
'password': '******',
'host': 'ifdbdaqrep01.fnal.gov',
'port': '5455'
}
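As a minimal sketch, the parameters above can be passed directly to psycopg (3), assuming the replica is reachable (directly from a Fermilab host, or through the SSH tunnel described in the next section); the run number in the query is made up.

```python
# Minimal sketch: query the replica directly with psycopg (3) using db_params above.
import psycopg

with psycopg.connect(**db_params) as conn:
    rows = conn.execute(
        "SELECT event_no, wr_seconds, wr_nanoseconds FROM triggerdata "
        "WHERE run_number = %s ORDER BY event_no LIMIT 5",
        (12345,),  # made-up run number: replace with a run of interest
    ).fetchall()
print(rows)
```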
Please contact Donatella Torretta or Matteo Vicenzi for the password.
Access via Python script
The following example shows a way to access the offline database replica. Here, selected entries are turned into a Pandas dataframe. The PostgreSQL database is accessed via the SQLAlchemy API (2.0), which uses the psycopg (3) backend: all these packages need to be installed and operational.
In addition, the database server is not directly accessible from outside Fermilab, so a workaround is needed.
One option is to open an SSH tunnel to the server, hopping through a Fermilab server we can access.
For example, using `icarusgpvm03.fnal.gov` (after obtaining a Kerberos ticket):
ssh -x -L '5455:ifdbdaqrep01.fnal.gov:5455' -N "${USER}@icarusgpvm03.fnal.gov"
should open the needed tunnel.
Here is the example, with some comments within:
import sqlalchemy as sql
import pandas
import time
ConnectionSettings = {
'database': 'icarus_trigger_prd',
'username': 'triggerdb_reader',
'password': '******',
'host': 'ifdbdaqrep01.fnal.gov',
'port': 5455,
}
TableName = 'triggerdata'
#
# Custom settings
#
# The password is obviously not published. About the host: it is not accessible from outside Fermilab.
# One way around it is SSH tunnelling, given that we have access to icarusgpvm03.fnal.gov via a Kerberos ticket:
#
# ssh -x -L '5455:ifdbdaqrep01.fnal.gov:5455' -N "${USER}@icarusgpvm03.fnal.gov"
#
# will forward local port 5455 (see ConnectionSettings['port'] above)
# to the same port on host ifdbdaqrep01.fnal.gov (see ConnectionSettings['host']),
# making the database reachable via localhost.
#
ConnectionSettings['password'] = r'******'
ConnectionSettings['host'] = 'localhost'
#
# create an "engine" to create connections to the database with
#
DBconnectionURL = sql.URL.create('postgresql+psycopg', **ConnectionSettings)
DBengine = sql.create_engine(DBconnectionURL, echo=True)
print(f"{DBconnectionURL=}")
#
# make the table and its schema known to SQLAlchemy library
#
DBmetadata = sql.MetaData() # glorified dictionary of stuff that SQLAlchemy can track
# triggerDataTable = sql.Table(TableName, DBmetadata, autoload_with=DBengine) # this would load the whole table schema
# instead of declaring the whole table schema, we skip the part that we don't need
# by selecting columns ahead;
# all these columns are integers; the primary key is the composite (run_number, event_no)
SelectedColumns = (
'run_number', 'event_no',
'wr_seconds', 'wr_nanoseconds',
'beam_seconds', 'beam_nanoseconds',
'gate_type', 'trigger_type', 'trigger_source',
'gate_id', 'gate_id_bnb', 'gate_id_numi', 'gate_id_bnboff', 'gate_id_numioff',
)
PrimaryKeys = { 'run_number', 'event_no' }
triggerDataTable = sql.Table(
TableName, DBmetadata,
*[ sql.Column(colName, sql.Integer, primary_key=(colName in PrimaryKeys), nullable=False)
for colName in SelectedColumns
],
)
#
# read the whole table (all available runs) into a Pandas dataframe, except:
# * calibration gate events
# * minimum bias events
# This query loads >11M events.
#
# The database, at the time of writing, has hundreds of millions of entries,
# and tens of gigabytes of data.
# Even with a fast connection and a lot of available memory, reading the whole
# thing is daunting.
# Limiting the range of runs via `run_number` column is often useful.
#
selector = sql.select(triggerDataTable).where(sql.and_(
sql.between(triggerDataTable.c.gate_type, 1, 4), # not calibration gate
triggerDataTable.c.trigger_type == 0, # light-based trigger
))
# print(f"Query:\n{selector}")
startTime = time.time()
with DBengine.connect() as DBconn:
df = pandas.read_sql_query(selector, DBconn)
print(f"Whoa! it took {time.time() - startTime:.1f}\" to load {len(df)} entries and {len(df.columns)} columns from the database!!!")
#
# combine the two timestamp pieces
#
df['triggerTimestamp'] = df.wr_seconds * 1_000_000_000 + df.wr_nanoseconds
df['beamGateTimestamp'] = df.beam_seconds * 1_000_000_000 + df.beam_nanoseconds # note: still includes the 4 us veto time
df['triggerFromBeamGate'] = df.triggerTimestamp - df.beamGateTimestamp
del df['wr_seconds'], df['wr_nanoseconds'], df['beam_seconds'], df['beam_nanoseconds'],
df
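Finally, as the comments above suggest, it is often useful to restrict the query to a run range; continuing from the script above (the run numbers here are made up):

```python
# Same query as above, restricted to a made-up run range to keep the load manageable.
firstRun, lastRun = 9000, 9100  # hypothetical run range: adapt to the runs of interest
selector = sql.select(triggerDataTable).where(sql.and_(
    triggerDataTable.c.run_number.between(firstRun, lastRun),
    sql.between(triggerDataTable.c.gate_type, 1, 4),  # not calibration gate
    triggerDataTable.c.trigger_type == 0,             # majority trigger
))
with DBengine.connect() as DBconn:
    df = pandas.read_sql_query(selector, DBconn)
print(f"Loaded {len(df)} triggers from runs {firstRun}-{lastRun}")
```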