INDEX
Explanations
instances of words related to running or moving quickly
instances of the words "run" and "ran."
Negative Logits
Token      Logit
Lauder     -0.68
repre      -0.66
theless    -0.63
alam       -0.60
suscept    -0.59
olia       -0.58
iasco      -0.58
defic      -0.58
ortium     -0.57
subsequ    -0.57
Positive Logits
Token      Logit
escape     1.01
aways      0.97
gs         0.93
af         0.89
away       0.85
nin        0.83
agate      0.81
dy         0.80
rampant    0.80
around     0.79
Activations Density: 0.049%