INDEX
    Explanations

    themes related to social justice and equality

    Negative Logits
     olsun
    -0.15
     Ulus
    -0.14
    /generated
    -0.14
    ОР
    -0.14
    é¨
    -0.14
    še
    -0.14
    caret
    -0.14
    gne
    -0.14
    ernet
    -0.14
     Fried
    -0.13
    Positive Logits
     still
    0.58
     Still
    0.54
    still
    0.52
    Still
    0.50
     STILL
    0.47
    仍
    0.42
     ainda
    0.37
     masih
    0.36
     ancora
    0.35
     هنوز
    0.35
    Act Density 0.216%

    No Known Activations