INDEX
Explanations
sentence endings and separators
Negative Logits
उदा             0.79
eg              0.70
第一個          0.69
deleting        0.68
less            0.68
magic           0.68
volunteering    0.68
homework        0.66
big             0.65
equality        0.65
Positive Logits
etera           1.23
.),             1.20
.).             1.17
.;              1.07
.],             1.01
.?              1.00
.—              0.99
."),            0.97
.');            0.96
.].             0.95
Activations Density: 0.114%