Explanations
phrases indicating an action that has not yet occurred or been completed
Negative Logits
Token       Logit
lies        -0.80
Species     -0.68
ths         -0.66
çīĪ         -0.65
Needs       -0.64
Must        -0.63
now         -0.62
Days        -0.62
still       -0.61
probably    -0.58
Positive Logits
Token       Logit
adequately  1.05
bothered    0.96
satisf      0.93
properly    0.93
harmed      0.92
icable      0.90
epad        0.89
anywhere    0.89
formally    0.88
prosecuted  0.88
Activations Density: 0.074%
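For readers unfamiliar with the density figure, the following is a minimal sketch of how such a percentage could be computed. It assumes that density means the fraction of sampled token positions on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions where the
    feature fires at all (activation > 0). This definition is not
    confirmed by the source page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 7 of which are active.
sample = [0.0] * 9993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 0.7]
density = activation_density(sample)

# Format as a percentage with three decimals, like the dashboard figure.
print(f"{density:.3%}")
```

Under this reading, a density of 0.074% would mean the feature is active on roughly 7 out of every 10,000 token positions, so it fires rarely.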