Explanations
positive and negative evaluations of situations or concepts
Negative Logits
umpt        -0.16
bows        -0.15
idy         -0.14
ebin        -0.14
apsed       -0.14
xAF         -0.14
assen       -0.14
або         -0.14
opi         -0.14
íĦ          -0.14
Positive Logits
agger       0.15
hen         0.15
umat        0.14
indefinite  0.14
uni         0.14
Uni         0.14
icies       0.14
Liberties   0.14
ooke        0.14
TEL         0.13
Activations Density 0.134%