INDEX
    Explanations

    the phrase "ain't" and its variations, indicating a focus on informal or colloquial language

    Negative Logits
    ге
    -0.17
    erable
    -0.15
    utsch
    -0.15
    reu
    -0.15
    hausen
    -0.15
    idge
    -0.15
    laces
    -0.15
    主
    -0.14
    ñas
    -0.14
     enr
    -0.14
    Positive Logits
    't
    0.18
    ’t
    0.16
     Soph
    0.16
    agle
    0.15
    tright
    0.15
    ult
    0.15
    ixo
    0.15
    dio
    0.14
    рович
    0.14
    éli
    0.14
    Activation Density 0.008%

    No Known Activations