INDEX
Explanations
words related to significant actions or states of being
Negative Logits
_marshall     -0.16
udad          -0.15
@g            -0.14
ende          -0.14
Integrated    -0.14
Ùħس           -0.14
yme           -0.14
kl            -0.14
iou           -0.14
afka          -0.14
Positive Logits
ients         0.16
onga          0.15
dge           0.15
anki          0.15
ivia          0.15
apia          0.15
hilar         0.15
ropical       0.14
uja           0.14
neutral       0.14
Activations Density 0.018%