INDEX
Explanations
words related to escalating or increasing actions
Negative Logits
Token            Logit
ternity          -0.75
pedia            -0.68
Samar            -0.68
circumst         -0.68
creen            -0.67
ĨĴ               -0.66
nsic             -0.66
KNOWN            -0.65
Confederation    -0.64
ä¸ī              -0.63
Positive Logits
Token            Logit
aging            1.24
arts             1.07
ages             1.04
up               1.00
antly            0.98
aged             0.98
hetamine         0.95
aign             0.92
inson            0.89
age              0.87
Activations Density 0.019%
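
As a rough illustration of how a density figure like the one above could be read, the sketch below treats activation density as the fraction of token positions on which the feature fires. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; the page does not say how its value was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition for illustration only; not taken from the source page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 2 of which activate the feature.
sample = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the figure on the page
```

Under this reading, a density of 0.019% would mean the feature is active on roughly 19 out of every 100,000 tokens.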