INDEX
    Explanations
    Negative Logits
    " збільш"      -0.07
    "Telefono"     -0.07
    "Keep"         -0.06
    " tekrar"      -0.06
    " itemType"    -0.06
    "innamon"      -0.06
    " KEEP"        -0.06
    " chois"       -0.06
    " timp"        -0.06
    " sizin"       -0.06
    Positive Logits
    "新闻"          0.07
    "//"           0.07
    "after"        0.06
    " 된다"         0.06
    "เผ"           0.06
    "-----↵↵"      0.06
    " اه"          0.06
    "----↵↵"       0.06
    " hunt"        0.06
    " lanc"        0.06
    Activation Density: 0.011%

    No Known Activations