    Negative Logits
        " Minor"           -0.08
        " minor"           -0.08
        " uitzonder"       -0.07
        " том"             -0.07
        "Minor"            -0.07
        (token missing)    -0.07
        "ാറ"               -0.07
        (token missing)    -0.07
        "umin"             -0.07
        " causes"          -0.07
    Positive Logits
        " ours"            0.09
        " Yours"           0.08
        " Beck"            0.08
        " Jub"             0.08
        "Nk"               0.08
        " yours"           0.08
        " Natalie"         0.08
        " gị"              0.08
        " ikaw"            0.07
        (token missing)    0.07
    Act Density 0.010%

    No Known Activations