    Explanations

    the pronoun "we" in various contexts

    Negative Logits
    ","                   -0.38
    "ly"                  -0.34
    " another"            -0.33
    " other"              -0.33
    " JADX"               -0.32
    " of"                 -0.31
    " foreign"            -0.31
                          -0.30
    " Autre"              -0.30
    " autre"              -0.30
    Positive Logits
    "We"                  1.20
    " we"                 1.05
    " We"                 1.00
    "Mereka"              0.98
    "Мы"                  0.94
    " เรา"                0.92
    " ſind"               0.91
    "they"                0.90
    " CreateTagHelper"    0.90
    "mereka"              0.89
    Activation Density: 0.175%

    No Known Activations