Explanations
references to specific entities or individuals
words associated with attributing information or statements
references to specific people or events
Negative Logits
Token      Logit
Vance      -0.73
adherents  -0.69
McInt      -0.68
rored      -0.66
mathemat   -0.65
Seymour    -0.65
Suff       -0.64
pse        -0.64
Rosenberg  -0.63
bably      -0.62
Positive Logits
Token      Logit
sburg      0.81
ebook      0.77
earth      0.75
elight     0.73
brates     0.71
Reviewer   0.70
"}],"      0.66
176        0.64
iculty     0.64
hunt       0.63
Activations Density: 0.000%