INDEX
Explanations
instances of the letter 'A' in various contexts
Negative Logits
lee      -0.15
inya     -0.15
ipo      -0.15
γο       -0.14
oto      -0.14
fri      -0.14
eldon    -0.14
tic      -0.14
ss       -0.14
153      -0.14
Positive Logits
emma     0.17
biz      0.16
elman    0.15
REW      0.15
uxtap    0.15
emey     0.15
.AR      0.14
ATIO     0.14
audi     0.14
ATAB     0.14
Activations Density: 0.005%