INDEX
    Explanations

    dialogue and conversational exchanges

    Text after punctuation/parenthesis

    jokes, laughter, or humor

    Negative Logits
    Token               Logit
    "Chham"             -0.36
    "IContainer"        -0.34
    "habad"             -0.33
    "sifs"              -0.32
    "Vidite"            -0.32
    " Grit"             -0.31
    " доступ"           -0.31
    " Divers"           -0.30
    "meiras"            -0.30
    " OnInit"           -0.30

    Positive Logits
    Token               Logit
    " الحره"            0.59
    " plegable"         0.56
    " laughter"         0.54
    "SharedCtor"        0.52
    " jouets"           0.51
    "RegressionTest"    0.51
    " joking"           0.50
    " yürü"             0.49
    " joke"             0.49
    " laughing"         0.49
    Activation Density: 0.545%

    No Known Activations