INDEX
Explanations
adjectives related to positive attributes
words that are phonetically playful or pun-like in nature
Negative Logits
Token       Logit
avia        -0.78
WAR         -0.77
aro         -0.69
ARC         -0.69
raped       -0.69
HI          -0.68
ynthesis    -0.67
chapter     -0.66
XIII        -0.65
endants     -0.65
Positive Logits
Token       Logit
ly          1.21
enough      1.13
ness        1.08
nesses      1.03
est         1.01
LY          0.90
NESS        0.89
ones        0.87
minded      0.84
liness      0.84
Activations Density: 0.523%