INDEX
Explanations
locations or directional words (e.g., "across," "from") in a sentence
the word "across" in various contexts
Negative Logits
Token     Logit
nery      -0.76
etic      -0.70
FORE      -0.69
spot      -0.63
rw        -0.61
ENC       -0.58
nce       -0.57
Parents   -0.57
getic     -0.57
HELP      -0.57
Positive Logits
Token     Logit
roads     0.83
atform    0.77
rooft     0.74
side      0.72
hang      0.71
flow      0.70
ĸļ        0.67
urst      0.67
paths     0.66
halla     0.66
Activations Density: 0.027%