INDEX
Explanations
Conditional statements or phrases involving "if"
"If" followed by pronouns or determiners
"If" followed by a subject
Negative Logits
Monfieur        -0.86
Efq             -0.79
myſelf          -0.79
itſelf          -0.78
<<<<<<<<<<<<<<  -0.76
헌              -0.73
whoſe           -0.72
شهاد            -0.70
Majefty         -0.70
extAlignment    -0.69
Positive Logits
anything        0.77
anyone          0.70
anybody         0.66
anything        0.61
Anybody         0.59
Anyone          0.58
anyone          0.57
hadn            0.56
Anyone          0.55
you             0.55
Activations Density 0.116%