INDEX
    Explanations
    NEGATIVE LOGITS
    alogy
    -0.95
     professed
    -0.94
    出了
    -0.92
     sustenance
    -0.91
    !」
    -0.89
     people
    -0.85
    ?」
    -0.82
     islet
    -0.82
    zzazione
    -0.82
     importance
    -0.82
    POSITIVE LOGITS
     polecam
    1.04
     cliquant
    1.01
    смотрим
    0.98
     čier
    0.94
    0.94
     cinquième
    0.93
     pią
    0.92
    posy
    0.92
     няколко
    0.92
     venido
    0.91
    Activation Density 0.007%

    No Known Activations