INDEX
Explanations
words related to physical appearance, particularly focusing on weight and body shape
terms related to body size and physical appearance
Negative Logits
ative        -0.88
iller        -0.88
esian        -0.83
inations     -0.83
enary        -0.80
argon        -0.79
ataka        -0.79
atography    -0.78
ophers       -0.78
endez        -0.77
Positive Logits
bum          0.96
LOAD         0.85
cheeks       0.80
Bey          0.75
belly        0.74
Fool         0.70
GROUND       0.70
button       0.70
buck         0.69
poke         0.69
Activations Density 0.054%