Explanations
instances of returning or return actions
instances of the word "return" and its variations
Negative Logits
ussen     -0.74
boss      -0.64
creen     -0.64
Cola      -0.64
bis       -0.62
mology    -0.61
astics    -0.60
vern      -0.60
ahi       -0.60
stood     -0.60
Positive Logits
home      0.94
HOME      0.84
home      0.80
ees       0.80
nil       0.73
postage   0.70
safely    0.70
owship    0.68
alive     0.68
pring     0.67
Activations Density 0.035%
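
The density figure above can be read as the share of token positions on which the feature fires at all. The source does not define the term, so the sketch below rests on an assumption: that density is the fraction of positions whose activation is strictly greater than zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to reproduce a value of the same size.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature is active; the source page does not spell out its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 7 active positions out of 20,000 tokens.
sample = [0.0] * 19_993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 5.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the feature is sparse: it stays silent on nearly every token and fires only in a narrow set of contexts, which fits the "return" explanations listed above.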