INDEX
Explanations
words related to temperature or emotions
expressions of warmth or warmth-related concepts
Negative Logits
Token       Logit
argon       -0.75
IMAGES      -0.74
FIL         -0.72
tumblr      -0.68
RECT        -0.67
dom         -0.67
sections    -0.67
issors      -0.66
doms        -0.66
Os          -0.65
Positive Logits
Token       Logit
warm        1.34
warm        1.24
achine      1.23
warmer      1.11
warmth      1.07
warming     1.02
Warm        1.01
warmed      0.97
fuzz        0.96
est         0.89
Activations Density: 0.012%