Negative Logits
local       0.42
political   0.41
quá         0.39
Parsing     0.39
Shortcuts   0.37
Melting     0.37
marca       0.37
LOCAL       0.37
fert        0.37
Sporting    0.37

Positive Logits
хам         0.41
ET          0.39
χρει        0.39
идет        0.38
Ⲱ           0.38
듣           0.38
棉          0.38
ühle        0.37
紡          0.37
愴          0.37

Activations Density 0.000%