INDEX
Explanations
similes or comparisons
Negative Logits
erity         -0.85
iencies       -0.79
furthermore   -0.76
moreover      -0.75
icals         -0.75
autions       -0.73
yd            -0.71
asio          -0.71
hani          -0.69
idates        -0.68
POSITIVE LOGITS
crap          0.81
spaghetti     0.81
sponge        0.80
Frankenstein  0.77
nightmare     0.76
pi            0.76
lifeless      0.75
encyclopedia  0.74
shit          0.72
miniature     0.71
Activations Density: 2.291%