INDEX
    Explanations

    repetitive expressions of agreement or acknowledgement

    Negative Logits
    "tic"         -0.75
    "nin"         -0.71
    "c"           -0.71
    "ni"          -0.70
    "len"         -0.69
    "nik"         -0.67
    "ity"         -0.65
    "ment"        -0.65
    "se"          -0.65
    "ls"          -0.64

    Positive Logits
    " also"       1.29
    " ALSO"       1.26
    "ALSO"        1.25
    "кож"         1.22
    " Também"     1.21
    "gså"         1.16
    " וגם"        1.12
    "also"        1.12
    "wnież"       1.11
    " También"    1.10

    Act Density: 0.158%

    No Known Activations