INDEX
Explanations
adjectives expressing intensity or importance
adjectives and descriptors indicating magnitude or quality
Negative Logits
warp            -0.63
Berks           -0.62
Leilan          -0.62
Wer             -0.60
Ley             -0.60
Rath            -0.59
Warp            -0.59
mans            -0.59
DM              -0.59
Kore            -0.59
Positive Logits
theless         1.38
terday          1.24
tenance         1.13
etheless        1.03
withstanding    0.96
selves          0.93
usterity        0.92
mosp            0.88
acters          0.85
lihood          0.85
Activations Density: 0.249%