INDEX
    Explanations

    references to the pronouns "you" and "us" indicating engagement or involvement

    Negative Logits
    DoubleQuotes    -0.29
     jongen         -0.27
     achtergrond    -0.27
    อะไร            -0.27
     disparu        -0.26
     niega          -0.25
     merveille      -0.25
    CommonModule    -0.25
     hablado        -0.25
     inilah         -0.25
    Positive Logits
     kaarangay        0.67
     Wikimedijinoj    0.66
     ब्रेकडाउन          0.66
    𞥄                0.66
    <unused6>        0.66
    <pad>            0.66
    <unused55>       0.65
    <unused76>       0.65
    <unused41>       0.65
    <unused17>       0.65
    Act Density 0.015%

    No Known Activations