INDEX
Explanations
words related to improvement and progress
phrases related to motivation and teamwork
Negative Logits
Token          Logit
reply          -0.66
mysteriously   -0.65
aired          -0.64
ORY            -0.62
uttered        -0.61
unknown        -0.61
ndra           -0.61
oshenko        -0.61
ATURE          -0.60
inexpl         -0.60
Positive Logits
Token          Logit
ourselves      2.02
our            1.50
them           1.04
myself         1.02
ours           1.01
everybody      1.00
OUR            0.91
Our            0.91
accordingly    0.88
everything     0.84
Activations Density: 0.505%