INDEX
Explanations
personal statements and affirmations
first-person singular pronouns and expressions of self-identity
Negative Logits
antitrust      -0.63
ses            -0.63
icist          -0.62
Millennium     -0.62
Gad            -0.59
Annotations    -0.56
TPS            -0.56
apiece         -0.56
Straw          -0.56
earch          -0.56
Positive Logits
'm         1.43
've        1.22
'll        1.13
am         1.04
owe        1.02
'd         1.00
ronic      0.92
deserve    0.90
adore      0.90
hereby     0.90
Activations Density 0.297%