INDEX
    Explanations

    listing items, connecting phrases

    Negative Logits
    That  1.08
     что  1.02
    أن  0.97
    0.91
    その  0.90
     که  0.89
     że  0.89
    ने  0.88
     що  0.87
     это  0.86
    Positive Logits
     sebagainya  0.91
    u  0.87
     prover  0.75
    riamo  0.73
    ,\\  0.73
    ,【  0.73
    wiches  0.71
     (#  0.71
     fiasco  0.70
    romeda  0.70
    Act Density 0.303%

    No Known Activations