INDEX
    Negative Logits
    Token                Logit
    (token not shown)    -0.08
    _                    -0.08
    "]                   -0.08
     insbesondere        -0.08
    $(                   -0.07
    "                    -0.07
     Various             -0.07
    "]:↵                 -0.07
    (token not shown)    -0.07
     logo                -0.07
    Positive Logits
    Token                Logit
     פחות                0.14
     ironically          0.13
     vähem               0.13
     کمتر                0.13
     reluctantly         0.13
     বিরুদ্ধে            0.12
     less                0.12
     наоборот            0.11
     avoidance           0.11
     avoided             0.11
    Activation Density: 0.981%

    No Known Activations