INDEX
Explanations
references to specific scientific data, measurements, or results
Negative Logits
↵    -0.48
'    -0.46
<eos>    -0.46
(token not shown)    -0.45
dą    -0.42
«    -0.42
Bit    -0.41
2    -0.41
ma    -0.40
(token not shown)    -0.40
Positive Logits
myſelf    0.99
RegressionTest    0.97
raiſ    0.93
'\\;'    0.92
itſelf    0.91
―――――    0.89
addPreferredGap    0.88
^(@)    0.87
ſeveral    0.87
themſelves    0.86
Activations Density 0.957%
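The page does not define the density figure. A common reading, assumed here, is the share of sampled token positions on which the feature's activation is non-zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over 8 tokens (illustrative only).
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.957% would mean the feature fires on roughly one token in a hundred of the sampled text.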