Explanations
phrases related to safety
references to safety and related concepts
Negative Logits
Token      Logit
issance    -0.95
eric       -0.86
naire      -0.83
ement      -0.80
ette       -0.79
yss        -0.77
eta        -0.75
sonian     -0.75
iguous     -0.72
ional      -0.72
Positive Logits
Token         Logit
safety        0.84
hazards       0.83
practition    0.78
margins       0.78
ailability    0.77
net           0.76
precautions   0.74
valve         0.74
hazard        0.73
margin        0.72
Activations Density: 0.039%