INDEX
Explanations
phrases and sentences that express criticism or controversy
Negative Logits
gnore           -0.20
especialmente   -0.19
特别            -0.18
surtout         -0.16
especially      -0.16
pecially        -0.16
특히            -0.15
especially      -0.15
Especially      -0.15
particularly    -0.15
Positive Logits
sounds          0.36
Sounds          0.35
Sounds          0.32
sounds          0.30
sound           0.29
neat            0.29
Sound           0.27
nice            0.26
sounded         0.26
cool            0.26
Activations Density 0.293%