INDEX
    Explanations

    interpersonal relationships, feelings

    Negative Logits
     his            -0.08
     he             -0.08
     coffin         -0.07
    Autor           -0.07
    (entity         -0.07
     He             -0.07
                    -0.07
     punishment     -0.07
    дел             -0.06
     projectiles    -0.06
    Positive Logits
    lama            0.07
    ática           0.06
    _numbers        0.06
     Haz            0.06
     STATIC         0.06
     ruce           0.06
     resolves       0.06
    weights         0.06
    .Math           0.06
     minha          0.06
    Activation Density: 0.778%

    No Known Activations