INDEX
    Explanations

    concepts related to uniqueness and individual qualities

    Negative Logits
    一切
    -0.20
    avail
    -0.17
     all
    -0.16
     semua
    -0.15
    одав
    -0.15
    urai
    -0.15
     everything
    -0.15
    änge
    -0.15
    swick
    -0.14
    ัย
    -0.14
    Positive Logits
     unique
    0.27
     different
    0.26
    不同的
    0.26
     differently
    0.26
     respective
    0.24
    unique
    0.24
    different
    0.23
     diferente
    0.23
    .unique
    0.23
     respectively
    0.22
    Act Density 0.211%

    No Known Activations