INDEX
Explanations
terms related to anti-discrimination and socio-political issues
Negative Logits
^(@)          -0.91
nahilalakip   -0.84
httphttps     -0.84
expandindo    -0.82
дописавши     -0.81
}*/           -0.81
InSection     -0.77
."]           -0.76
……"           -0.74
IonicModule   -0.73
Positive Logits
さな    0.52
my      0.50
;       0.49
my      0.49
(       0.49
1       0.48
“       0.48
dona    0.47
g       0.47
tti     0.46
Activations Density: 0.147%