INDEX
Explanations
actions and verbs related to approval and desire
Negative Logits
iÄįka       -0.15
oons        -0.15
ép          -0.15
ãĥ³ãĥij      -0.14
cio         -0.14
awks        -0.14
achuset     -0.14
NORMAL      -0.13
çĭIJ         -0.13
-leg        -0.13
Positive Logits
ande        0.15
ierce       0.14
boa         0.14
ONGL        0.14
__$         0.14
esse        0.14
dr          0.14
ız          0.14
ÑĮÑı        0.13
cape        0.13
Activations Density: 0.040%