INDEX
    Explanations

    relational words and family dynamics

    Negative Logits
    "aint"        -0.17
    "ollen"       -0.16
    "ampp"        -0.16
    "pong"        -0.15
    "ilir"        -0.15
    "lez"         -0.15
    "ugi"         -0.14
    "aped"        -0.14
    "075"         -0.14
    "-desktop"    -0.14
    Positive Logits
    "ámara"             0.14
    "::"                0.14
    "Spo"               0.14
    " possibilities"    0.14
    " ÑĤа"              0.13
    " Tang"             0.13
    " Nora"             0.13
    "/cms"              0.13
    " tj"               0.13
    " bicy"             0.13
    Activation Density: 0.001%

    No Known Activations