    Negative Logits
    " "              1.21
    ":"              1.06
    "?"              0.93
    " czyli"         0.93
    "."              0.92
    ","              0.86
    " ("             0.82
    " -"             0.82
    " this"          0.80
    " itself"        0.79
    Positive Logits
    "provide"        0.91
    " provide"       0.79
    " entwickeln"    0.77
    "<unused474>"    0.76
    "anggap"         0.73
    "ផ្តល់"           0.72
    " લોકોને"        0.72
    "stopwords"      0.71
    " collaborate"   0.71
    "dengan"         0.70
    Activation Density: 0.003%

    No Known Activations