INDEX
Explanations
expressions involving expectations or hypothetical situations
Negative Logits
Sandwich      -0.60
ammy          -0.60
revived       -0.59
demolished    -0.58
stood         -0.57
Cance         -0.57
sei           -0.56
reopened      -0.55
usha          -0.55
isen          -0.55
Positive Logits
expect         1.17
normally       1.15
ordinarily     1.09
otherwise      1.08
imagine        0.92
characterize   0.92
anticipate     0.85
tolerate       0.85
deem           0.84
prefer         0.84
Activations Density 0.104%