Explanations
expressions of uncertainty or speculation
Negative Logits
shi        -0.16
endoza     -0.15
indo       -0.15
apur       -0.15
whats      -0.15
INDIRECT   -0.14
/articles  -0.14
elerik     -0.14
Ñıн        -0.13
vik        -0.13
Positive Logits
they    0.21
there   0.16
nobody  0.16
aken    0.15
UME     0.15
none    0.15
THEY    0.15
they    0.15
we      0.14
They    0.14
Activations Density: 0.244%