INDEX
Explanations
phrases that indicate conclusions or endings in a narrative context
Negative Logits
lick     -0.15
jt       -0.15
Ñĸдно    -0.14
jd       -0.14
abel     -0.14
_INTR    -0.14
atri     -0.13
à¥Ĥड    -0.13
ulty     -0.13
ÑĭÑĪ     -0.13
Positive Logits
Scope    0.15
mov      0.14
enville  0.14
VES      0.14
νÏĮ      0.14
oblin    0.14
Kob      0.14
Klo      0.14
two      0.14
ingly    0.14
Activations Density 0.070%