Explanations
phrases related to the potential or capacity of actions or consequences
modal verbs expressing ability or potential
Negative Logits
Token         Logit
Seeking       -0.65
bidding       -0.65
Fighter       -0.64
Fighting      -0.63
Mant          -0.61
Ivory         -0.60
din           -0.60
submitting    -0.59
honoring      -0.58
bats          -0.58
Positive Logits
Token         Logit
't            1.59
adian         1.20
berra         1.18
isters        1.12
NOT           1.08
easily        1.06
ister         1.03
be            1.01
afford        0.97
confuse       0.94
Activation Density: 0.149%