INDEX
Explanations
various forms of the verb "be" in different contexts
Negative Logits
ÑĭÑģ      -0.17
mia       -0.16
urm       -0.15
vided     -0.14
beck      -0.14
æ¡£       -0.14
erti      -0.14
undi      -0.13
kowski    -0.13
rine      -0.13
Positive Logits
ardless   0.18
ying      0.17
sure      0.16
oga       0.15
eh        0.15
asts      0.15
Ù쨳       0.14
friend    0.14
ething    0.14
oted      0.14
Activations Density 0.061%