INDEX
Explanations
sentences with pronouns referring to specific individuals or groups
references to groups of people and individuals involved in actions or events
Negative Logits
Eleven        -0.69
GMT           -0.65
gate          -0.62
wake          -0.62
geries        -0.61
wick          -0.60
itu           -0.59
violent       -0.58
population    -0.58
Um            -0.57
Positive Logits
drew          0.86
wrote         0.86
hoped         0.83
also          0.83
'll           0.82
did           0.80
ensured       0.79
've           0.79
gave          0.78
recommends    0.78
Activations Density 0.449%