INDEX
    Explanations

    terms related to measurement and classification in various contexts

    Negative Logits
    Token     Logit
    s         -0.20
    hook      -0.20
    ร         -0.19
    ings      -0.19
    ing       -0.18
    haf       -0.17
    sar       -0.17
    onet      -0.17
    iest      -0.17
    ein       -0.17
    Positive Logits
    Token     Logit
    ALLY      0.60
    ally      0.55
    ity       0.37
    amente    0.32
    all       0.30
    ITY       0.29
    ated      0.26
    ians      0.26
    ities     0.25
    us        0.24
    Activation Density: 0.231%

    No Known Activations