INDEX
Explanations
the occurrence of the word "including."
Negative Logits
ai        -0.78
acket     -0.76
aya       -0.75
esa       -0.72
cel       -0.70
gor       -0.70
peg       -0.69
erv       -0.69
assian    -0.68
agne      -0.68
Positive Logits
ones            0.91
yours           0.91
ours            0.88
those           0.87
hers            0.84
mine            0.75
possibly        0.75
vice            0.72
theirs          0.69
incidentally    0.64
Activations Density: 0.118%