INDEX
    Explanations

    concepts or adjectives that describe inherent qualities or attributes

    Negative Logits
    Token             Logit
    "yk"              -0.17
    " descargar"      -0.15
    "_LITERAL"        -0.15
    "_ASM"            -0.15
    "âm"              -0.15
    "ourt"            -0.14
    "orr"             -0.14
    "воз"             -0.14
    "iso"             -0.14
    "ione"            -0.14
    Positive Logits
    Token             Logit
    " naturally"      0.28
    " Naturally"      0.20
    " natural"        0.20
    "/default"        0.19
    " Natural"        0.18
    "Natural"         0.18
    " Automatically"  0.16
    "/native"         0.16
    " automatically"  0.16
    " instinct"       0.15
    Act Density 0.071%

    No Known Activations
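
The activation density figure above is conventionally read as the fraction of tokens on which the feature fires. The page does not state how it was computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    is active; the dashboard itself does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, 7 of which activate the feature.
sample = [0.0] * 9993 + [1.2] * 7
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.071% corresponds to roughly 7 active tokens per 10,000, so a feature this sparse can easily have no recorded activations in a small sample of text.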