INDEX
    Explanations

    references to environmental activism and political efforts

    Negative Logits
    Specifier
    -0.18
    ÌĢ
    -0.17
    pcm
    -0.17
    ãĤ
    -0.17
    iginal
    -0.16
    .Interop
    -0.15
    IGNORE
    -0.14
    τηση
    -0.14
    ربه
    -0.14
    emann
    -0.14
    Positive Logits
     ourselves
    0.27
     our
    0.27
     yourselves
    0.25
    our
    0.21
     ours
    0.20
    我们的
    0.18
     nossa
    0.18
     nuestra
    0.17
     нашей
    0.17
     nosso
    0.17
    Act Density 0.108%

    No Known Activations