Explanations
the word "where"
questions or statements inquiring about a specific location
Negative Logits
Token     Logit
yi        -0.81
ATURE     -0.78
asts      -0.74
ilus      -0.69
agers     -0.69
ME        -0.66
vous      -0.66
astics    -0.66
emed      -0.64
ems       -0.64
Positive Logits
Token     Logit
abouts    1.34
upon      1.12
fore      1.03
soever    0.82
else      0.80
ver       0.71
exactly   0.69
velt      0.67
with      0.65
Dat       0.65
Activations Density: 0.049%