    Explanations

    phrases that describe concepts

    Negative Logits
    [blank]          0.23
    [blank]          0.23
    " ഇത്തരം"        0.22
    "whom"           0.22
    " lakini"        0.22
    " nhưng"         0.22
    " яку"           0.22
    " wobei"         0.22
    " പറയുന്നു"      0.21
    " परंतु"          0.21
    Positive Logits
    " that"          0.47
    " thats"         0.36
    " solely"        0.31
    " specifically"  0.30
    " designed"      0.30
    "that"           0.30
    " explicitly"    0.30
    " purely"        0.29
    " thay"          0.28
    " deemed"        0.28
    Act Density 0.554%

    No Known Activations