INDEX
Explanations
phrases indicating ease of completing tasks or processes
Negative Logits
ÑģÑĤв    -0.17
usty      -0.15
conserv   -0.15
k         -0.15
bigger    -0.14
branch    -0.14
needed    -0.14
to        -0.14
Routes    -0.14
_MODAL    -0.14
Positive Logits
easier    0.30
ease      0.27
Ease      0.25
_easy     0.21
Ease      0.21
easy      0.20
easiest   0.20
easy      0.20
ease      0.20
eas       0.19
Activations Density: 0.073%