INDEX
Explanations
descriptors that convey admiration or aesthetic appreciation
Negative Logits
Token         Logit
riteria       -0.14
ervas         -0.14
ythe          -0.14
elon          -0.13
.FontStyle    -0.13
shaw          -0.13
resh          -0.13
ouver         -0.13
attenu        -0.13
owa           -0.13
Positive Logits
Token           Logit
Hed             0.21
oneself         0.18
hed             0.17
itself          0.16
cha             0.16
.gdx            0.15
presentation    0.15
themselves      0.15
captures        0.15
thems           0.14
Activations Density: 0.000%