INDEX
    Negative Logits
    Token               Logit
    "<bos>"             -0.57
    "Chham"             -0.50
    " estekak"          -0.46
    "verwijspagina"     -0.45
    " Rocca"            -0.44
    " rockets"          -0.43
    " famously"         -0.42
    " bunnies"          -0.42
    " surla"            -0.42
    " campaigned"       -0.42
    Positive Logits
    Token               Logit
    " satisfied"        1.70
    "satisfied"         1.57
    " Satisfied"        1.51
    "Satisfied"         1.39
    "atisfied"          1.14
    " satisfe"          1.04
    " satisfecho"       1.02
    " tevreden"         0.93
    " zufrieden"        0.85
    " satisfait"        0.85
    Activation Density: 0.005%

    No Known Activations