INDEX
    Explanations

    references to individuality or personal contributions

    Negative Logits
    inha
    -0.16
    odian
    -0.15
    AGER
    -0.15
    уру
    -0.15
    few
    -0.15
    upy
    -0.15
    ico
    -0.15
    kova
    -0.14
     Parr
    -0.14
    ogh
    -0.14
    Positive Logits
     individual
    0.27
     Individual
    0.26
    individual
    0.23
    Individual
    0.21
     個
    0.20
    _individual
    0.18
     individ
    0.18
    个
    0.18
    /single
    0.18
    個
    0.18
    Act Density 0.102%

    No Known Activations