INDEX
    Explanations

    elements related to social interactions and contexts

    Negative Logits
     even
    -0.25
    even
    -0.20
     sogar
    -0.19
     especially
    -0.19
     also
    -0.19
     too
    -0.18
    竣
    -0.18
     despite
    -0.18
     both
    -0.17
    -even
    -0.17
    Positive Logits
     nothing
    0.46
     NOTHING
    0.41
    nothing
    0.40
     Nothing
    0.36
     thôi
    0.36
    Nothing
    0.34
     nada
    0.28
     nowhere
    0.28
     ничего
    0.27
     saja
    0.26
    Act Density 0.060%

    No Known Activations