INDEX
Explanations
instances of the word "welcome" or its variations
Negative Logits
t        -0.16
yen      -0.15
cuff     -0.15
Gould    -0.15
eced     -0.15
eff      -0.15
ortion   -0.15
ythe     -0.15
lej      -0.14
quia     -0.14
Positive Logits
coming      0.24
comed       0.21
les         0.19
hausen      0.19
nesday      0.18
wyn         0.18
summer      0.17
.SizeMode   0.17
comes       0.17
Wel         0.17
Activations Density 0.005%