    Explanations

    numerical values or quantities referenced in the text

    Negative Logits
    Token                Logit
    ";"                  -0.57
    ":"                  -0.57
    " nahilalakip"       -0.53
    ","                  -0.52
    " the"               -0.51
    (blank in source)    -0.49
    "."                  -0.48
    "XmlAccessorType"    -0.48
    "),"                 -0.47
    " bad"               -0.47
    Positive Logits
    Token                Logit
    " myſelf"            0.85
    " raiſ"              0.85
    " itſelf"            0.82
    " kasarigan"         0.81
    " iſt"               0.79
    " ſeveral"           0.79
    " reaſon"            0.77
    " deſt"              0.76
    " ſever"             0.75
    " ſta"               0.74
    Activation Density: 0.533%

    No Known Activations