Negative Logits
Token           Logit
abouts          -0.84
abases          -0.67
tein            -0.64
inet            -0.63
heastern        -0.62
existent        -0.62
nels            -0.61
Administ        -0.60
peer            -0.59
nsic            -0.58
Positive Logits
Token           Logit
surprise         1.25
shock            0.96
surprises        0.94
Surprise         0.91
surpr            0.90
unexpected       0.87
surprising       0.86
disappointment   0.81
shock            0.81
revelation       0.79
Activation Density: 0.084%