Explanations
adjectives used to describe characteristics or qualities
adjectives and descriptors related to characteristics and states
Negative Logits
earchers  -0.68
proven    -0.63
ittees    -0.61
Heard     -0.59
reon      -0.58
adr       -0.58
çīĪ       -0.58
Lost      -0.58
hig       -0.57
wan       -0.56
Positive Logits
ness    1.63
NESS    1.28
nesses  1.25
ity     1.19
ly      1.01
liness  0.96
ities   0.92
ones    0.89
est     0.84
versus  0.82
Activations Density 0.235%