INDEX
Explanations
instances of the word "of."
Negative Logits
lady      -0.16
reau      -0.15
iphy      -0.15
ibar      -0.15
ÏĢε       -0.15
issing    -0.14
ÏĢά       -0.14
stance    -0.13
urai      -0.13
Narr      -0.13
Positive Logits
↵↵        0.17
icer      0.15
ipop      0.15
PEC       0.14
lém       0.14
otros     0.14
appa      0.14
readcr    0.14
JC        0.14
953       0.13
Activations Density: 0.058%