INDEX
Explanations
loan words or historical phrases, often French or Latin
Negative Logits
Theſe            -0.77
argint           -0.73
κτηρισ           -0.71
cetines          -0.66
IntoConstraints  -0.66
stiefel          -0.65
Gallimard        -0.65
sweter           -0.64
umbro            -0.63
Hozzáférés       -0.62
Positive Logits
relle    0.54
È        0.52
CURLOPT  0.50
illon    0.49
velle    0.46
ปร       0.46
Sept     0.45
lein     0.45
کل       0.45
langu    0.45
Activations Density: 0.218%