INDEX
Explanations
references to prohibitions or restrictions
Negative Logits
essentiel    -0.64
ertale       -0.63
equipe       -0.62
ryb          -0.60
Arteta       -0.60
":[{         -0.60
core         -0.59
الوس         -0.59
%)$          -0.58
Dyke         -0.58
Positive Logits
bans         1.39
ban          1.38
Ban          1.34
Ban          1.30
BAN          1.26
banning      1.26
Bann         1.21
BAN          1.18
banish       1.12
Bans         1.11
Activations Density 0.105%
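
For readers unfamiliar with the density figure, here is a minimal sketch of one common way such a number is computed: the fraction of token positions on which the feature's activation is above zero. This reading of "Activations Density" is an assumption, not something the page states, and the function name, threshold, and sample data below are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: 'density' means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2000 token positions, the feature fires on 2 of them.
acts = [0.0] * 2000
acts[10] = 3.2
acts[1500] = 0.7

density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.105% would mean the feature fires on roughly one token in every 950.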