INDEX
Explanations
instances where someone is pretending or playing a role
expressions related to pretending and personal identity
Negative Logits
owder        -0.71
arton        -0.67
ousand       -0.67
arian        -0.67
cember       -0.66
zb           -0.65
mentioned    -0.65
olla         -0.64
atl          -0.64
"]=>         -0.64
Positive Logits
invincible   1.13
infall       0.96
superiority  0.87
inferior     0.87
unbeat       0.83
omnip        0.76
immutable    0.76
innocence    0.74
nonexistent  0.74
benevolent   0.73
Activations Density: 0.332%