INDEX
Explanations
terms associated with creativity and relative comparisons
Negative Logits
Token    Logit
ness     -0.88
nya      -0.85
ers      -0.79
r        -0.78
m        -0.73
n        -0.71
er       -0.71
iness    -0.70
nt       -0.68
z        -0.65
Positive Logits
Token         Logit
ſeveral       1.23
Houſe         1.21
myſelf        1.17
Theſe         1.09
―――――         1.08
houſe         1.07
pleaſure      1.06
themſelves    1.04
ſmall         1.04
Reſ           1.03
Activations Density: 0.273%