Explanations
variations of the word "able," indicating a focus on capability or capacity
Negative Logits
Token        Logit
<bos>        -0.76
your         -0.52
the          -0.46
Schwartz     -0.44
Herbst       -0.41
Erickson     -0.40
its          -0.40
alongside    -0.40
these        -0.39
בח           -0.39
Positive Logits
Token        Logit
able         1.24
Able         1.10
Able         1.04
unable       0.90
unable       0.89
Unable       0.89
Unable       0.85
Ability      0.84
capaces      0.84
capaz        0.83
Activations Density: 0.010%