Explanations
assertions about visual appeal and imagery
Negative Logits
역시         -0.15
ず           -0.15
cket        -0.15
Yorker      -0.15
yine        -0.14
toujours    -0.14
Obviously   -0.14
esign       -0.14
siempre     -0.14
surtout     -0.14
Positive Logits
border      0.30
practically 0.29
almost      0.29
rival       0.28
literally   0.27
borders     0.27
nearly      0.27
border      0.27
almost      0.25
borderline  0.25
Activations Density 0.246%