    Explanations

    science fields

    Negative Logits
     myſelf       -1.27
     pleaſure     -1.21
     itſelf       -1.15
    <bos>         -1.14
     '\\;'        -1.11
     Efq          -1.10
     themſelves   -1.09
     raiſ         -1.09
     Monfieur     -1.09
     Paglinawan   -1.09
    Positive Logits
                  0.70
    :             0.62
    .             0.60
     of           0.57
    ,             0.56
     (            0.52
     T            0.50
     A            0.50
    ch            0.49
     O            0.48
    Act Density 0.207%

    No Known Activations