INDEX
    Explanations

    references to measurement or metrics related to performance or evaluation

    NEGATIVE LOGITS
    mathrm          -0.69
    stretchr        -0.68
    <eos>           -0.67
     and            -0.61
     saites         -0.60
     […]            -0.60
    </em>           -0.59
     mathvariant    -0.57
     …              -0.55
     @"/            -0.54
    POSITIVE LOGITS
     pleaſure       0.94
     raiſ           0.94
     poffe          0.88
     purpoſe        0.87
     étoient        0.85
     houſe          0.84
    出版年             0.83
     auffi          0.80
     Efq            0.79
     feroit         0.78
    Act Density 0.470%

    No Known Activations