INDEX
Explanations
numerical references and identifiers in academic citations
Negative Logits
tremend    -0.60
rongh      -0.59
ongh       -0.57
prem       -0.56
rium       -0.55
efully     -0.54
ilts       -0.54
rika       -0.53
reek       -0.53
hetical    -0.53
Positive Logits
Merit      0.68
Magicka    0.61
ã쮿        0.59
Papers     0.58
Starcraft  0.58
GMT        0.56
Journals   0.56
çİĭ        0.56
éĽ         0.55
ayer       0.55
Activations Density: 0.034%