INDEX
    Explanations

    references to scales and measurements

    Negative Logits
    zelf      -0.22
    est       -0.17
    assis     -0.17
    ernals    -0.16
    sell      -0.16
    urer      -0.16
    iates     -0.15
    unker     -0.15
    sWith     -0.15
    ries      -0.15
    Positive Logits
    -down     0.22
    ToFit     0.20
    -up       0.20
    azy       0.17
    out       0.17
    ardy      0.17
    -out      0.16
    able      0.16
    andin     0.15
    à¤Ĥड      0.15
    Activation Density: 0.016%

    No Known Activations