INDEX
Explanations
demonstrative pronouns and their connections to context
Negative Logits
erm       -0.19
above     -0.16
Wright    -0.15
above     -0.15
enci      -0.15
ischen    -0.14
rap       -0.14
vice      -0.14
akh       -0.14
asi       -0.14
Positive Logits
iž          0.17
ROWSER      0.15
esser       0.14
그것        0.14
atz         0.14
CppObject   0.14
nist        0.14
ENTITY      0.14
onu         0.13
Dispatch    0.13
Activations Density 0.177%
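
As a rough illustration of how a density figure like this can be read, the sketch below assumes that activation density means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce a value of 0.177%; they are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions where the
    feature fires at all, not a measure of how strongly it fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 177 of 100,000 token positions.
sample = [1.0] * 177 + [0.0] * (100_000 - 177)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.177% means the feature is sparse: it is active on roughly 1 in every 565 tokens.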