INDEX
    Explanations

    occurrences of commas in the text

    NEGATIVE LOGITS
     eux
    -0.19
     THEM
    -0.17
    them
    -0.17
     них
    -0.16
     lui
    -0.15
     нього
    -0.15
     ragaz
    -0.14
     them
    -0.14
     حالی
    -0.14
     него
    -0.13
    POSITIVE LOGITS
     there
    0.50
     it
    0.46
     we
    0.35
    there
    0.35
     they
    0.28
     you
    0.26
     many
    0.26
     Ù쨥ÙĨ
    0.25
    ,it
    0.25
     nothing
    0.25
    Act Density 0.501%

    No Known Activations