Explanations
personal pronouns and past tense verbs related to actions
pronouns, particularly focusing on characters and their actions or states
Negative Logits
Token        Logit
Reviewer     -0.83
ogue         -0.73
arthed       -0.73
catentry     -0.72
aunch        -0.67
uesday       -0.66
opolis       -0.63
ruciating    -0.62
retty        -0.62
Globe        -0.61
Positive Logits
Token        Logit
'll           1.34
'd            1.17
're           0.98
ought         0.94
could         0.94
SHOULD        0.92
've           0.91
misunder      0.91
forgot        0.90
would         0.89
Activations Density: 0.220%