INDEX
    Explanations

    names and references to individuals or entities in the text

    Negative Logits
    zelf      -0.19
    ijo       -0.18
    UFFIX     -0.16
    erm       -0.16
    ities     -0.16
    INGS      -0.15
     Agility  -0.15
    ovice     -0.15
    bins      -0.15
    roud      -0.15
    Positive Logits
    icut      0.19
    ion       0.18
    atic      0.17
    icult     0.16
    al        0.16
     từng     0.16
    ось       0.16
    be        0.16
    /ref      0.15
    نامه      0.15
    Act Density 0.380%

    No Known Activations