Explanations
expressions of personal identity and individual experience
Negative Logits
ÑģÑĤи        -0.14
ervas        -0.14
apiro        -0.14
Ñĸдно        -0.14
.annotate    -0.13
arios        -0.13
ãĥ£          -0.13
sortOrder    -0.12
assi         -0.12
Æ¡           -0.12
Positive Logits
talking      0.75
referring    0.73
refer        0.69
refers       0.60
Talking      0.56
talk         0.55
Talking      0.54
reference    0.53
refer        0.49
speaking     0.48
Activations Density 0.152%