INDEX
Explanations
first-person and third-person pronouns in reflective contexts
Negative Logits
Token        Logit
Millennium   -0.54
Mandatory    -0.53
Episode      -0.51
Nationwide   -0.50
Unlimited    -0.48
Round        -0.47
Reading      -0.46
Measure      -0.46
assembly     -0.46
Around       -0.45
Positive Logits
Token        Logit
'll          1.08
'd           1.06
've          0.98
didn         0.89
forgot       0.87
're          0.86
swore        0.85
hadn         0.84
didnt        0.83
knew         0.83
Activations Density: 0.405%