INDEX
Explanations
content related to social issues, particularly racism and discrimination
Negative Logits
للاسماء            -0.58
ślę                -0.55
TagMode            -0.51
oa̍t                -0.50
StringTokenizer    -0.49
>=",               -0.49
kysy               -0.47
tensione           -0.45
segni              -0.45
MLLoader           -0.45
Positive Logits
hypocritical       1.01
laughable          0.94
idiotic            0.93
hypocrisy          0.90
phony              0.90
incompetent        0.90
propaganda         0.89
misguided          0.89
pathetic           0.88
delusional         0.86
Activations Density: 1.068%