INDEX
Explanations
instances of the word "first"
instances of the word "first" in various contexts
Negative Logits
Gould      -0.82
Canaver    -0.71
morph      -0.70
holes      -0.68
mbuds      -0.66
Nadu       -0.66
ueller     -0.66
vor        -0.64
utic       -0.64
tics       -0.63
Positive Logits
baseman      1.13
responders   1.10
glance       0.90
impression   0.82
lady         0.81
impressions  0.77
glim         0.75
ancest       0.75
blush        0.74
step         0.72
Activations Density 0.072%