INDEX
    Explanations

    terms related to size or comparison in context

    Negative Logits
     T            -0.71
     mit          -0.69
                  -0.66
     Mat          -0.63
     Z            -0.61
     j            -0.61
     k            -0.61
     Mil          -0.60
     f            -0.60
     Bad          -0.58
    Positive Logits
     leſs         1.26
     myſelf       1.25
     itſelf       1.17
     himſelf      1.16
    theless       1.15
    ſelf          1.14
     Anſ          1.11
     themſelves   1.06
     reaſon       1.06
    wiſe          1.06
    Act Density 0.124%

    No Known Activations