INDEX
Explanations
phrases indicating the concept of searching for or naming entities
Negative Logits
aille      -0.16
agu        -0.15
she        -0.15
aghetti    -0.13
anko       -0.13
asje       -0.13
they       -0.13
nite       -0.13
agine      -0.13
akh        -0.13
Positive Logits
oneself      0.47
ourselves    0.46
herself      0.45
himself      0.45
themselves   0.44
yourself     0.43
себя         0.41
自己         0.39
myself       0.39
zich         0.37
Activations Density 0.087%