INDEX
    Explanations

    phrases that emphasize frequency or degree of experience

    Negative Logits
     гораздо    -0.48
     удобно    -0.47
    arakat    -0.45
     hơn    -0.44
     besser    -0.43
     nzuri    -0.42
     Efq    -0.42
    verty    -0.41
    ERE    -0.41
     Arce    -0.41
    Positive Logits
     fier    0.80
     prou    0.78
     dir    0.77
     classi    0.72
     cra    0.72
     flas    0.71
     humb    0.71
     nas    0.71
     migh    0.69
     HAP    0.67
    Act Density 0.389%

    No Known Activations