INDEX
Explanations
references to interpersonal relationships and emotional reactions
Negative Logits
oni        -0.17
ienes      -0.15
loat       -0.14
persona    -0.14
oned       -0.14
rito       -0.14
iej        -0.13
bone       -0.13
zza        -0.13
{}{↵       -0.13
Positive Logits
andler     0.16
ced        0.15
ãģ¾ãģ¾     0.15
croft      0.14
ding       0.14
abela      0.14
JOB        0.14
extern     0.14
pid        0.14
862        0.14
Activations Density: 0.332%