Explanations
terms related to personal relationships and emotional experiences
Follows a personal pronoun and "had"
pronouns followed by past tense verbs
Negative Logits
furent    -0.86
firent    -0.82
went      -0.82
took      -0.78
furono    -0.77
Went      -0.76
became    -0.76
fue       -0.75
went      -0.74
Became    -0.73
Positive Logits
hadn      1.10
would     1.02
would     0.86
habían    0.84
Would     0.81
Would     0.79
had       0.79
WOULD     0.78
había     0.73
had       0.73
Activation Density: 0.750%