Explanations
direct speech or quotes from individuals
Negative Logits
phans     -0.17
ÑĪев      -0.16
ardon     -0.16
zac       -0.15
zoom      -0.15
енÑģ      -0.14
abyrin    -0.14
oren      -0.14
_ctxt     -0.14
ofs       -0.14
Positive Logits
elon      0.14
anch      0.14
Brill     0.14
онÑĮ      0.14
Mgr       0.14
çł        0.14
agos      0.14
agi       0.14
utex      0.14
agine     0.14
Activations Density 0.021%