Explanations
phrases related to interpersonal relationships
references to interpersonal relationships and actions involving "someone."
Negative Logits
ories       -0.82
DOS         -0.74
tnc         -0.67
veyard      -0.67
osterone    -0.66
enegger     -0.63
heny        -0.62
letters     -0.62
HL          -0.59
Plex        -0.58
Positive Logits
else        2.10
else        1.43
Else        1.31
Else        1.23
who         1.08
whom        0.93
whose       0.88
who         0.85
's          0.79
EL          0.73
Activations Density 0.069%