INDEX
Explanations
descriptions related to physical appearance, particularly focusing on hairstyles and grooming characteristics
Negative Logits
Token             Logit
corred            -0.57
sob               -0.51
ActionButton      -0.50
ThroughAttribute  -0.49
fulfilled         -0.48
Cory              -0.47
Maren             -0.47
pare              -0.47
thrill            -0.47
corp              -0.46
Positive Logits
Token             Logit
hair              1.08
hairstyle         1.01
hairstyles        0.99
hairst            0.96
Hair              0.96
HAIR              0.95
hairdresser       0.94
Hair              0.89
haired            0.81
hair              0.81
Activations Density: 0.110%