INDEX
Explanations
mentions of possessive pronouns 'its' and 'his'
Negative Logits
JUST       -0.67
****       -0.64
ILE        -0.61
anza       -0.59
cture      -0.58
Enabled    -0.58
nih        -0.58
***        -0.57
DEN        -0.57
Brow       -0.57
Positive Logits
attendant        1.22
consequ          1.16
accompanying     1.14
ensuing          1.10
associated       1.09
subsequent       1.02
surrounding      0.99
corresponding    0.99
resultant        0.98
thereof          0.97
Activations Density 2.311%