INDEX
Explanations
comparative phrases indicating increasing intensity or correlation between actions
phrases indicating increasing intensity or complexity
Negative Logits
é¾įå      -0.87
çīĪ       -0.76
poke      -0.73
oths      -0.69
ospons    -0.66
unic      -0.66
rities    -0.64
esides    -0.64
IVERS     -0.62
john      -0.62
Positive Logits
chance        0.84
chances       0.76
likely        0.73
tendency      0.70
likelihood    0.68
raq           0.68
probability   0.67
CHA           0.66
temptation    0.65
havoc         0.62
Activations Density 0.062%