INDEX
Explanations
the number "four" in various contexts
occurrences of the word "four"
Negative Logits
idence       -0.68
Happ         -0.66
Fed          -0.66
UGE          -0.65
Rica         -0.64
isky         -0.63
UTION        -0.63
ller         -0.61
reference    -0.61
LER          -0.60
Positive Logits
teenth       1.87
teen         1.86
hundred      1.12
eenth        1.11
some         1.10
een          1.06
aciously     1.03
square       1.00
acious       0.92
fif          0.92
Activations Density: 0.049%