INDEX
Explanations
instances of the word "here."
Negative Logits
rine        -0.15
minster     -0.15
ri          -0.15
roc         -0.15
anco        -0.15
tryside     -0.14
Woodward    -0.14
otate       -0.14
UNCH        -0.14
shot        -0.14
Positive Logits
æ±          0.20
after       0.17
ault        0.17
inkel       0.17
ίνα         0.15
fore        0.15
vation      0.15
ection      0.14
kening      0.14
ina         0.14
Activations Density 0.055%