INDEX
Explanations
phrases indicating a negative event or action associated with individuals or groups
the occurrence of the verb "were."
Negative Logits
Flavoring     -0.59
emis          -0.55
Merit         -0.53
Seller        -0.51
position      -0.50
itionally     -0.50
2020          -0.50
ËĪ            -0.49
Effective     -0.49
ebin          -0.48
Positive Logits
were          2.81
weren         2.50
were          2.49
Were          1.97
Were          1.90
are           1.69
aren          1.48
're           1.39
hadn          1.32
differed      1.24
Activations Density 0.169%