Explanations
instances of the word "away."
Negative Logits
Token        Logit
Briggs       -0.68
plurality    -0.64
umn          -0.64
chini        -0.64
ingham       -0.62
accompan     -0.61
million      -0.60
chang        -0.60
tone         -0.60
haircut      -0.59
Positive Logits
Token        Logit
from         0.79
altogether   0.75
agy          0.74
entirely     0.71
ãĤĵ          0.71
sites        0.71
FROM         0.70
Favorite     0.67
lest         0.67
zx           0.65
Activations Density: 0.011%