INDEX
    Explanations

    but followed by pronoun or article

    Negative Logits
    "،"           0.27
    (not shown)   0.26
    (not shown)   0.22
    (not shown)   0.22
    "в"           0.21
    (not shown)   0.21
    "$,"          0.20
    " sabbam"     0.20
    (not shown)   0.20
    (not shown)   0.20
    Positive Logits
    " it"         0.27
    " in"         0.25
    " at"         0.23
    " certainly"  0.22
    "Y"           0.21
    " don"        0.21
    "N"           0.21
    " often"      0.20
    "Imagine"     0.20
    "V"           0.20
    Act Density 0.434%

    No Known Activations