Explanations
questions that begin with "Did" or "did" followed by a pronoun
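
As a rough illustration of the explanation above, the pattern it describes can be approximated with a regular expression. This is only a sketch of the textual pattern, not of how the feature itself computes its activation. The pronoun list and the function name `matches_description` are assumptions made for the example, since the explanation does not enumerate which pronouns count, and the check looks only at how the text begins, not at whether it ends with a question mark.

```python
import re

# Hypothetical pronoun list: the explanation does not say which pronouns count.
PRONOUNS = ("i", "you", "he", "she", "it", "we", "they")

# "Did" or "did" at the very start, whitespace, then a pronoun as a whole word.
# The (?i:...) group makes only the pronoun part case-insensitive.
DID_PRONOUN = re.compile(r"^[Dd]id\s+(?i:%s)\b" % "|".join(PRONOUNS))


def matches_description(text: str) -> bool:
    """Return True if the text starts with "Did"/"did" followed by a pronoun."""
    return DID_PRONOUN.match(text.lstrip()) is not None


if __name__ == "__main__":
    for sample in ["Did you see that?", "did I leave it on?", "Did the dog bark?"]:
        print(sample, matches_description(sample))
```

A text that satisfies this check is only a candidate for activating the feature; the actual activations come from the model, not from the regular expression.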
Negative Logits
sto      -0.16
ÏĦÏī     -0.16
izer     -0.15
indre    -0.15
ozy      -0.14
quali    -0.14
illin    -0.14
_HC      -0.14
idor     -0.13
ähr      -0.13
Positive Logits
bah      0.14
ijken    0.14
ponge    0.14
ijo      0.14
raise    0.14
afort    0.14
allery   0.14
Bij      0.13
flen     0.13
UTE      0.13
Activations Density 0.041%