INDEX
Explanations
pronouns and identity-related words
references to personal identity and relationships
Negative Logits
Token         Logit
assembly      -0.58
Round         -0.57
Shelby        -0.56
entary        -0.55
atory         -0.54
Associated    -0.53
atel          -0.52
Gene          -0.52
Nationwide    -0.52
Millennium    -0.51
Positive Logits
Token         Logit
'll           1.06
've           1.04
'd            0.98
're           0.89
owe           0.84
reluct        0.81
encount       0.80
knew          0.78
ATHER         0.75
swore         0.71
Activations Density: 0.599%