INDEX
Explanations
dates and proper names in a structured format
sequences of numerical data or percentages in discussions
Negative Logits
antioxid    -0.72
Magikarp    -0.71
exha        -0.68
adolesc     -0.68
76561       -0.66
.""         -0.63
Balt        -0.62
DAQ         -0.62
undermin    -0.59
corrid      -0.59
Positive Logits
↵               1.68
<|endoftext|>   1.21
↵Âł             1.11
↵↵              1.09
While           0.79
These           0.77
This            0.77
However         0.75
Edit            0.75
Also            0.74
Activations Density: 1.815%