INDEX
Explanations
descriptions of physical appearances and characteristics
references to human-like appearances or characteristics
Negative Logits
Sources      -0.82
nesday       -0.80
liga         -0.79
ciplinary    -0.73
piracy       -0.73
Fund         -0.72
Free         -0.70
ETF          -0.70
Mandatory    -0.69
Ô            -0.69
Positive Logits
attire       1.63
complexion   1.62
hairst       1.61
facial       1.59
silhouette   1.57
appearance   1.55
likeness     1.55
physique     1.55
demeanor     1.54
tattoos      1.49
Activations Density 0.516%
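
The density figure above is usually read as the fraction of sampled token positions on which the feature fires. The sketch below shows that arithmetic under this assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (positions where the feature fires) / (all positions).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 1000 token positions, 5 of which have a nonzero activation.
sample = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.516% would mean the feature is active on roughly 5 of every 1000 sampled tokens, which fits a feature tied to a narrow topic such as physical appearance.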