INDEX
    Explanations
    Negative Logits
     любое       -0.09
    jw           -0.08
    Satellite    -0.08
     Herald      -0.07
     segala      -0.07
     demikian    -0.07
    athlon       -0.07
    heids        -0.07
    heden        -0.07
    Positive Logits
    (poly        0.08
     sequer      0.08
     لس          0.08
     gerek       0.08
    ,更          0.07
     ق           0.07
     सिं          0.07
    (pol         0.07
    ্র           0.07
     tril        0.07
    Act Density 0.021%

    No Known Activations