Explanations
references to personal pronouns and possessive forms
followed by auxiliary verbs
subject-verb constructions
Negative Logits
Token     Logit
jména     -0.50
Ошибка    -0.49
적으로      -0.48
ciri      -0.46
okazji    -0.45
vian      -0.45
jší       -0.45
koop      -0.43
uny       -0.43
逊         -0.43
Positive Logits
Token              Logit
InjectAttribute    1.13
ostavi             0.98
verwijspagina      0.95
UnsafeEnabled      0.93
createSlice        0.91
лтемелер           0.90
__':               0.89
__':               0.86
nakalista          0.86
surla              0.86
Activations Density 0.649%