Explanations
references to information mentioned or said earlier in a text
phrases that refer to prior statements or mentions
Negative Logits
istries   -0.85
fu        -0.80
Made      -0.79
mens      -0.71
istered   -0.70
bled      -0.69
mop       -0.67
wear      -0.67
ores      -0.67
stru      -0.66
Positive Logits
earlier         1.18
previously      0.94
above           0.88
previous        0.75
deduction       0.73
acknow          0.73
aforementioned  0.72
mentioned       0.72
yesterday       0.70
hitherto        0.70
Activations Density 0.087%