Explanations
personal nouns referring to people or groups of people
mentions of "lives" and their improvement or significance in various contexts
Negative Logits
Token        Logit
NES          -0.79
AMI          -0.74
ority        -0.70
Null         -0.69
iban         -0.68
Rat          -0.64
neutrality   -0.63
CAST         -0.63
steamapps    -0.62
Syndicate    -0.62
Positive Logits
Token        Logit
chool        1.17
cape         1.04
pring        0.96
ynthesis     0.93
paces        0.90
journal      0.85
erver        0.84
pace         0.83
guard        0.83
lihood       0.81
Activations Density: 0.026%