INDEX
Explanations
references to personal life events, actions, and relationships
conjunctions and phrases indicating relationships and personal affiliations
Negative Logits
Were            -0.75
URN             -0.72
Ended           -0.72
Then            -0.68
apter           -0.68
ÙIJ             -0.67
urations        -0.65
rived           -0.64
RAL             -0.64
âĶĢâĶĢâĶĢâĶĢ    -0.63
Positive Logits
insists         1.47
intends         1.44
accuses         1.41
refuses         1.37
believes        1.34
prefers         1.32
expects         1.29
enjoys          1.28
maintains       1.28
opposes         1.28
Activations Density: 0.744%