    Explanations

    questions and interrogative phrases

    Negative Logits
    (token missing)  -0.94
    <bos>            -0.91
    Lmfao            -0.75
    <?               -0.73
    Hahahahaha       -0.71
    Hahah            -0.69
    /**              -0.69
    Lma              -0.69
    Noice            -0.62
    (blank token)    -0.60
    Positive Logits
     lemp            0.94
     Valentín        0.93
     quoc            0.91
     ananas          0.88
     paradiso        0.88
     barbacoa        0.88
     cristo          0.86
     thuy            0.85
     nuoc            0.85
     paloma          0.85
    Act Density 0.107%

    No Known Activations