INDEX
Explanations
instances of individuals pretending to be something they are not
instances of people or entities assuming false identities
Negative Logits
sugg             -0.84
many             -0.80
Reviewer         -0.78
roundup          -0.75
disproportion    -0.69
ucket            -0.68
most             -0.65
cumulative       -0.65
raq              -0.64
accum            -0.64
Positive Logits
invincible       0.75
Nicarag          0.72
aphael           0.72
clown            0.70
Egyptian         0.70
Hermes           0.69
CGI              0.69
Nak              0.69
harmless         0.68
Martian          0.68
Activations Density 0.177%