    Explanations

    sentences that express feelings of happiness and positivity

    Negative Logits
    Token           Logit
    " nephe"        -0.75
    " hierogly"     -0.75
    " hyal"         -0.72
    " erythro"      -0.72
    " Picchu"       -0.71
    " causation"    -0.70
    "ükemmel"       -0.70
    "ediakan"       -0.70
    "ulongan"       -0.69
    " sahaja"       -0.69
    Positive Logits
    Token           Logit
    " kont"         0.51
    " kel"          0.44
    " bes"          0.44
    " dan"          0.43
    " lont"         0.43
    " ber"          0.43
    " di"           0.42
    " ke"           0.41
    " dengan"       0.39
    " meng"         0.39
    Activation Density: 0.180%

    No Known Activations