Explanations
specific actions or strategies indicated by action verbs such as "moves"
instances of the word "moves" in various contexts
Negative Logits
Alpine     -0.68
oys        -0.66
OA         -0.66
Gateway    -0.63
Koran      -0.62
thia       -0.61
sson       -0.60
Barton     -0.59
lain       -0.58
Alaska     -0.58
Positive Logits
peed       1.08
ivism      0.82
itters     0.80
ets        0.79
moves      0.78
horizont   0.77
HUD        0.76
itic       0.75
brates     0.75
hops       0.74
Activations Density: 0.008%