Explanations
references to safety measures or concerns
mentions of safety and related concepts
Negative Logits
issance    -0.93
eric       -0.88
yss        -0.83
naire      -0.82
ette       -0.81
estine     -0.77
iguous     -0.76
eta        -0.76
ement      -0.75
nai        -0.75
Positive Logits
precautions    0.84
hazards        0.83
net            0.80
valve          0.78
safety         0.78
hazard         0.76
practition     0.74
valves         0.72
safety         0.72
violations     0.71
Activations Density 0.055%