INDEX
Explanations
conditional statements suggesting potential outcomes
conditional phrases expressing hypothetical outcomes
Negative Logits
Named       -0.68
Writing     -0.64
Chal        -0.61
Returning   -0.59
dwelling    -0.59
Trap        -0.58
charged     -0.57
Xuan        -0.57
sho         -0.57
Moving      -0.56
Positive Logits
be           0.98
ideally      0.96
undoubtedly  0.96
surely       0.95
imply        0.94
suffice      0.94
probably     0.93
entail       0.92
doubtless    0.91
allow        0.90
Activations Density: 0.148%