    Explanations

    phrases related to direct and indirect actions or contributions

    Negative Logits
    atics         -0.18
    irtual        -0.17
    ocs           -0.16
    ะ             -0.14
     Ñĩин        -0.14
    .synthetic    -0.14
    ajor          -0.14
     entire       -0.14
    eming         -0.13
    readcr        -0.13
    Positive Logits
    idad          0.16
    amente        0.16
    .Direct       0.16
    ives          0.16
    -direct       0.16
    aneously      0.16
    ivity         0.16
     direct       0.16
     olarak       0.14
    bote          0.14
    Activation Density: 0.031%

    No Known Activations