Explanations
numerical values at the beginning of specific patterns
the number four in various contexts
Negative Logits
venue         -0.71
Cheong        -0.69
vier          -0.69
Whitman       -0.67
iday          -0.65
esville       -0.64
Winchester    -0.64
becca         -0.62
woo           -0.61
arial         -0.61
Positive Logits
teenth        1.28
teen          1.22
eva           1.06
hyde          0.94
some          0.87
Kids          0.85
amaz          0.85
cyl           0.83
Chan          0.82
66666666      0.81
Activations Density 0.100%