INDEX
Explanations
phrases indicating comparison or contrast
phrases that include the word "let" followed by related phrases that indicate limitations or exclusions
Negative Logits
Token     Logit
cill      -0.73
acci      -0.64
ottest    -0.62
esi       -0.62
ombat     -0.62
encount   -0.62
assian    -0.59
ird       -0.56
ole       -0.55
idian     -0.55
Positive Logits
Token     Logit
alone     1.54
tered     0.85
Alone     0.83
ting      0.83
tering    0.77
aside     0.75
downs     0.73
ingly     0.68
loose     0.68
us        0.67
Activations Density: 0.015%