Explanations
phrases indicating lack of completion or action
the phrase "haven't" and its variants in various contexts
Negative Logits
dress      -0.69
otype      -0.67
upp        -0.67
urat       -0.67
rophe      -0.59
yles       -0.58
mens       -0.57
guid       -0.57
Ballistic  -0.56
angering   -0.56
Positive Logits
't         1.09
ited       0.89
ÃŃ         0.86
Been       0.84
tyard      0.83
geon       0.82
anu        0.81
ģĸ         0.80
dayName    0.79
eteen      0.79
Activations Density: 0.015%