Explanations
references to individuals in a narrative context, particularly focusing on their characteristics and relationships
Negative Logits
Token      Logit
pta        -0.17
uela       -0.17
Female     -0.15
Female     -0.15
pets       -0.14
íĨ¤        -0.14
sik        -0.14
loor       -0.13
Prostit    -0.13
à¸ı        -0.13
Positive Logits
Token      Logit
guy        0.71
man        0.64
dude       0.49
fellow     0.47
person     0.47
guys       0.45
chap       0.44
blo        0.41
gentleman  0.39
Guy        0.38
Activations Density 0.319%