INDEX
    NEGATIVE LOGITS
    ,         1.76
    ،         1.42
    -,        1.33
    -         1.32
    /         1.32
    ,}        1.23
    -/        1.23
    --        1.19
    ...),     1.18
    ...,      1.15
    POSITIVE LOGITS
     are      1.42
     along    1.28
     które    1.27
     often    1.25
     came     1.24
     which    1.22
     которые  1.20
     जिनमें    1.20
     żeby     1.19
     were     1.18
    Activation Density 0.219%

    No Known Activations