INDEX
Explanations
punctuation marks and periods indicating sentence endings
Negative Logits
Token            Logit
juſ              -0.86
Билгалдахарш     -0.79
desmotivaciones  -0.76
myſelf           -0.75
ſou              -0.75
faſt             -0.75
viſ              -0.75
auffi            -0.74
itſelf           -0.73
ſever            -0.72
Positive Logits
Token  Logit
.      0.93
}$.    0.69
$.     0.63
.      0.62
].     0.59
}.     0.57
).     0.57
|.     0.56
.'.    0.56
?.     0.54
Activations Density 1.524%