Explanations
sections or sentences identified as abstracts in academic papers
Negative Logits
Token      Logit
Par        -0.54
(blank)    -0.52
<eos>      -0.50
sport      -0.50
Con        -0.49
ch         -0.49
↵          -0.49
Con        -0.48
Al         -0.48
El         -0.48
Positive Logits
Token      Logit
</thead>   0.98
myſelf     0.94
jspb       0.93
Ass        0.92
MENAFN     0.90
Ass        0.90
../../../  0.90
zzleHttp   0.89
poffible   0.88
auffi      0.87
Activations Density 0.080%