INDEX
Explanations
the number four in various contexts
the repeated mention of the number four
Negative Logits
idence      -0.74
Rica        -0.66
UGE         -0.66
Fed         -0.65
Happ        -0.64
ller        -0.63
isky        -0.63
utable      -0.62
Collider    -0.62
reference   -0.61
Positive Logits
teenth      1.80
teen        1.77
hundred     1.14
een         1.09
some        1.08
eenth       1.08
aciously    1.04
months      0.96
fif         0.96
square      0.96
Activations Density: 0.036%