INDEX
Explanations
phrases indicating authorship or previous mention in a written piece
first-person personal pronouns and statements
Negative Logits
////        -0.73
Choice      -0.66
UCT         -0.65
OUP         -0.63
jection     -0.62
Favorite    -0.62
orses       -0.62
Requ        -0.62
Journal     -0.62
letters     -0.61
Positive Logits
mentioned   0.94
alluded     0.88
progressed  0.87
stated      0.85
explained   0.85
noted       0.84
pointed     0.84
progresses  0.84
discussed   0.79
recounted   0.78
Activations Density 0.056%