INDEX
Explanations
statements emphasizing safety and security as top priorities
statements related to priorities concerning safety and security
Negative Logits
ugh      -0.73
satell   -0.70
ozyg     -0.70
gobl     -0.70
dup      -0.67
traces   -0.66
uge      -0.66
prints   -0.65
behaved  -0.65
ipp      -0.64
Positive Logits
priority     1.64
paramount    1.52
priorities   1.36
priority     1.33
Priority     1.13
overriding   1.06
imperative   1.03
priorit      1.01
cornerstone  0.99
foremost     0.98
Activations Density: 0.304%