INDEX
Explanations
mentions of colors, particularly purple and pink
the color purple
Negative Logits
Token        Logit
Green        -0.48
green        -0.47
yeşil        -0.45
Green        -0.43
grüne        -0.43
verdes       -0.42
-------      -0.41
grünen       -0.40
green        -0.40
greenish     -0.39
Positive Logits
Token              Logit
liothèque          0.52
Tenggara           0.49
putExtra           0.48
awtextra           0.47
حوالہ              0.47
يتيمه              0.47
+#+#               0.46
RetentionPolicy    0.45
ñola               0.45
mobileqq           0.45
Activations Density 0.202%