Explanations
questions or statements expressing uncertainty or possibility
expressions of uncertainty or questions about future events
Negative Logits
derived         -0.72
arers           -0.70
pecially        -0.70
APH             -0.69
advertisement   -0.69
pione           -0.68
ustomed         -0.67
arest           -0.66
ascript         -0.66
ItemImage       -0.66
Positive Logits
he              1.20
she             0.90
Romo            0.88
Dwight          0.86
Jarrett         0.86
Teddy           0.86
Jimmy           0.86
Trey            0.86
they            0.84
AJ              0.83
Activations Density 0.392%
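
The density figure above is not defined on this page. A minimal sketch, assuming it means the percentage of sampled tokens on which the feature has a nonzero activation, is shown below. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to reproduce the 0.392% figure; they are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. The dashboard itself does not state this.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 392 of 100,000 tokens.
sample = [1.5] * 392 + [0.0] * (100_000 - 392)
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under that reading, a density of 0.392% would correspond to the feature firing on roughly 4 tokens in every 1,000 sampled.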