INDEX
    Explanations

    terms related to abstract and concrete concepts in various contexts

    Negative Logits
    unta           -0.17
    acha           -0.16
    upe            -0.15
    istrovství     -0.15
    уча            -0.15
    rag            -0.15
    辰             -0.14
    olla           -0.14
    ot             -0.14
    orus           -0.14
    Positive Logits
    ed             0.23
    ivism          0.18
    ified          0.18
    itious         0.18
    edly           0.17
    edImage        0.17
    angelo         0.16
    iment          0.15
    iction         0.15
     jungle        0.15
    Act Density 0.016%

    No Known Activations