INDEX
    Explanations

    the presence of quotation marks or apostrophes in the text

    Negative Logits
    \""
    -0.95
    }}"
    -0.89
    osoba
    -0.85
      “
    -0.85
     Menge
    -0.84
    {}".
    -0.84
    (",")
    -0.84
     "}
    -0.84
    }".
    -0.81
     Peque
    -0.80
    Positive Logits
    !='
    1.08
     '
    1.05
    Ndr
    1.03
    ]='\
    0.95
     ='
    0.95
    =’
    0.94
     ('
    0.94
    ==='
    0.93
    >';
    
    0.93
     '.
    0.92
    Act Density 0.247%
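
As a rough illustration of how a figure like the activation density above is usually read, the sketch below computes the fraction of token positions on which a feature fires. This is an assumption about the metric's definition, not something the page states; the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition (not confirmed by the page): density = active positions / all positions.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2 active positions out of 8 tokens.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

Under this reading, a density of 0.247% would mean the feature is active on roughly 1 in 400 token positions.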

    No Known Activations