INDEX
    Explanations

    pronoun followed by a verb

    Negative Logits
    其他  0.66
     lorsque  0.66
     amelyek  0.64
     подходя  0.61
     demás  0.61
     其他  0.61
    的其他  0.58
    From  0.57
     اخرى  0.56
    我认为  0.56
    Positive Logits
     can  1.02
     owes  1.01
     may  1.01
     thrives  0.97
     owns  0.97
     lacks  0.97
     seems  0.96
     has  0.95
     hasn  0.95
     is  0.95
    Activation Density: 0.341%

    No Known Activations