INDEX
Explanations
the word "Nice"
instances of the words "Nice" and "Grab."
Negative Logits
Token          Logit
timely         -0.78
ocene          -0.78
Appalachian    -0.69
onics          -0.65
opposing       -0.63
ript           -0.63
psi            -0.62
Jem            -0.61
andestine      -0.61
itudinal       -0.60
Positive Logits
Token          Logit
Nice           2.35
Grab           1.89
Grab           1.64
Nice           1.57
grab           1.45
flow           1.37
Braun          1.18
Cannes         1.18
Bild           1.16
Lug            1.13
Activations Density: 0.052%