    Explanations

    verbs followed by 'and' or comma

    Negative Logits
    " heritage"       0.45
    " hanging"        0.43
    " itp"            0.42
    " Մ"              0.42
    " uniquement"     0.40
    " plays"          0.39
    "也可以"          0.39
    " संबंधी"          0.38
    "或其他"          0.38
    " march"          0.38
    Positive Logits
    " आणि"            1.52
    " અને"            1.48
    " and"            1.45
    " এবং"            1.44
    " और"             1.42
                      1.42
    "และ"             1.40
    " ਅਤੇ"            1.40
    " και"            1.38
    " и"              1.34
    Activation Density: 0.061%

    No Known Activations