    NEGATIVE LOGITS
    Token         Logit
    " CDN"        -0.07
    " Centre"     -0.07
    "toHave"      -0.07
    "Invite"      -0.07
    " one"        -0.07
    "acin"        -0.07
    "Lead"        -0.06
    " ONE"        -0.06
    "Variant"     -0.06
    "kul"         -0.06
    POSITIVE LOGITS
    Token         Logit
    " speech"     0.19
    " Speech"     0.15
    "speech"      0.13
    " speeches"   0.12
    "peech"       0.11
    "Speech"      0.11
    " grief"      0.07
    " نماز"       0.07
    "GH"          0.07
    " Spielberg"  0.07
    Activation Density: 0.007%

    No Known Activations