INDEX
Explanations
emoticons and symbols, like arrows and percent signs
topics related to emotional distress or trauma
Negative Logits
photoc      -0.76
seiz        -0.75
anwhile     -0.67
mathemat    -0.66
minim       -0.65
COP         -0.64
buy         -0.64
mete        -0.63
telev       -0.63
pastry      -0.62
Positive Logits
Ļ           1.64
¤           1.26
ħ           1.22
ĺ           1.14
¶           1.10
¬           1.09
ĸ           1.04
µ           1.03
Ĩ           1.03
ĥ           1.03
Activations Density: 0.351%