INDEX
Explanations
the word "the" at the beginning of sentences
instances of the word "the"
Negative Logits
vu      -0.72
uala    -0.72
fy      -0.68
UA      -0.67
!,      -0.64
vg      -0.64
vp      -0.64
vana    -0.63
Boo     -0.63
VG      -0.61
Positive Logits
course      1.08
weekend     1.07
entirety    0.96
span        0.96
ensuing     0.89
holidays    0.88
whole       0.88
millennia   0.87
latter      0.87
entire      0.86
Activations Density 0.092%