INDEX
Explanations
adjectives related to physical characteristics or qualities
adjectives describing various characteristics and qualities
Negative Logits
accordingly  -0.74
`.           -0.73
SPONSORED    -0.70
Shards       -0.68
inctions     -0.68
.''.         -0.67
RELE         -0.65
Strongh      -0.65
Reloaded     -0.65
Tokens       -0.65
Positive Logits
-        1.01
bred     0.98
headed   0.95
faced    0.94
haired   0.94
hearted  0.91
blonde   0.86
looking  0.86
bearded  0.85
neck     0.83
Activations Density 0.346%