INDEX
    Explanations
    Negative Logits
    θος           -0.07
    apeutic       -0.07
    Finish        -0.07
    -payment      -0.06
    rous          -0.06
    enburg        -0.06
    minimal       -0.06
    اءة           -0.06
    σω            -0.06
     Bernard      -0.06
    Positive Logits
    autocomplete  0.07
     ovar         0.07
     fint         0.07
     ประก         0.06
     zap          0.06
     rooftop      0.06
     Suarez       0.06
     omp          0.06
     upscale      0.06
     """↵         0.06
    Activation Density: 0.022%

    No Known Activations