Explanations
negative descriptors and terms associated with social or political criticism
Negative Logits
Personendaten     -0.67
فريبيس            -0.65
yntaxException    -0.62
Хьажоргаш         -0.54
Superhost         -0.52
IVEREF            -0.51
httphttps         -0.51
للاسماء           -0.51
matchCondition    -0.49
SPATH             -0.48
Positive Logits
lixo              0.53
waste             0.50
inutile           0.49
stupid            0.48
basura            0.47
Stupid            0.43
stupid            0.42
pointless         0.42
useless           0.42
waste             0.41
Activations Density: 0.644%