Explanations
specific determiners or pronouns associated with identity
the very, same, entire, first
Negative Logits
pravi     -0.35
rxjs      -0.34
nearby    -0.31
lagi      -0.30
εκ        -0.29
ที        -0.28
veliko    -0.28
ให        -0.27
kollu     -0.27
alway     -0.27
Positive Logits
+#+#             0.73
rungsseite       0.71
ſicht            0.68
ロウィン          0.66
tartalo          0.66
autorytatywna    0.65
zwiſchen         0.65
(blank)          0.65
パンチラ          0.65
OrNil            0.65
Activations Density 0.046%