INDEX
    Explanations

    expressions of personal frustrations or desires in informal language

    Contractions followed by certain words

    common conversational phrases

    Negative Logits
    Token             Logit
    "ReusableCell"    -0.74
    " laun"           -0.73
    "Portail"         -0.66
    "()}>"            -0.65
    " Wheeler"        -0.62
    " Rine"           -0.62
    "Étienne"         -0.61
    "cstdlib"         -0.61
    "̯"               -0.60
    " Erdoğan"        -0.60
    Positive Logits
    Token             Logit
    " يتيمه"          0.90
    " المعيارى"       0.87
    " تانيه"          0.84
    "ölcs"            0.82
    " äta"            0.81
    " ujednoznacz"    0.79
    " wuß"            0.78
    "fycat"           0.78
    "orszá"           0.76
    "koľ"             0.75
    Activation Density: 0.244%

    No Known Activations