INDEX
Explanations
relevant nouns
references to titles, statistics, and numerical data
Negative Logits
Was         -0.84
Was         -0.80
lished      -0.71
expected    -0.71
was         -0.71
Added       -0.68
Said        -0.67
printed     -0.66
WAS         -0.66
pared       -0.65
Positive Logits
are         1.55
reside      1.44
comprise    1.34
occupy      1.34
belong      1.32
aren        1.31
constitute  1.28
operate     1.24
resemble    1.24
rely        1.23
Activations Density: 0.615%