Explanations
references to bans and restrictions
Negative Logits
Token         Logit
بيها          -0.81
mtext         -0.73
s             -0.71
p             -0.67
tu            -0.62
mes           -0.59
e             -0.58
付            -0.58
Everett       -0.56
numberWith    -0.56
Positive Logits
Token         Logit
bans          1.43
Bans          1.34
Bann          1.28
banning       1.28
Bans          1.26
banish        1.15
Banerjee      1.13
RemoveField   1.11
bans          1.11
banned        1.10
Activations Density 0.141%