Explanations
pronouns and verbs related to self-identity
references to personal identity and self-identification
Negative Logits
Token         Logit
temptation    -0.79
aptic         -0.66
pitfalls      -0.64
passage       -0.64
hiba          -0.63
overdose      -0.63
showers       -0.63
onga          -0.62
pmwiki        -0.62
dosage        -0.62
Positive Logits
Token         Logit
pires         0.79
selves        0.77
adherent      0.74
los           0.73
"@            0.72
"#            0.71
subscrib      0.71
"$:/          0.71
unbeat        0.70
çīĪ           0.69
Activations Density 0.066%