INDEX
Explanations
words related to importance or significance
terms related to importance and urgency
Negative Logits
Token       Logit
rity        -0.83
ancies      -0.82
ramids      -0.80
adders      -0.79
istries     -0.77
chambers    -0.77
ignt        -0.76
azines      -0.75
apons       -0.75
poons       -0.75
Positive Logits
Token       Logit
breaker     0.88
starter     0.86
deterrent   0.85
unto        0.84
unless      0.80
punishable  0.76
occurrence  0.75
because     0.75
stay        0.75
anymore     0.75
Activations Density: 0.195%