INDEX
Explanations
the word "as" in various contexts
Negative Logits
æĸ          -0.64
ries        -0.59
roads       -0.58
rior        -0.57
artifacts   -0.57
aster       -0.55
agos        -0.55
Sharp       -0.54
rous        -0.54
Psy         -0.54
Positive Logits
pired       1.22
pires       1.15
phalt       1.11
pects       1.06
bestos      1.06
cription    1.04
semb        1.02
pire        0.99
pect        0.99
semble      0.98
Activations Density 0.115%