INDEX
Explanations
years in the 2000s mentioned in references
references to publication years in academic citations
Negative Logits
imer     -0.73
pee      -0.71
mir      -0.70
pees     -0.64
unal     -0.64
Bay      -0.64
isec     -0.63
ype      -0.63
ety      -0.63
venge    -0.63
Positive Logits
å¹           1.08
Edition      0.72
EDITION      0.70
é¾           0.70
Diversity    0.69
Highlights   0.69
doctoral     0.69
-)           0.68
)—           0.67
UX           0.65
Activations Density 0.067%