Explanations
references to legal or academic citations
Negative Logits
jména      -0.46
cerimonia  -0.42
hjelp      -0.39
politiet   -0.39
kvinder    -0.38
héroe      -0.38
ín         -0.37
duros      -0.36
fjor       -0.36
avancée    -0.36
Positive Logits
aga        0.54
asser      0.54
ager       0.54
lab        0.53
itt        0.53
yszcz      0.53
rub        0.52
itter      0.52
rab        0.52
ull        0.52
Activations Density: 0.427%