Explanations
phrases or references related to "troopers."
Negative Logits
lef      -0.18
ums      -0.17
urnal    -0.16
nge      -0.15
zioni    -0.15
rý       -0.15
mits     -0.15
lam      -0.15
rov      -0.15
urm      -0.15
Positive Logits
dden     0.28
ppo      0.23
opers    0.23
UBLE     0.22
oper     0.21
tro      0.21
isi      0.20
Tro      0.20
oping    0.20
Tro      0.19
Activations Density 0.010%