INDEX
    Explanations
    Negative Logits
     fest       -0.07
    gis         -0.06
    qua         -0.06
    อร          -0.06
    _Do         -0.06
     Singular   -0.06
     rigor      -0.06
     familial   -0.06
    Eu          -0.06
     Fou        -0.06
    Positive Logits
     tokens      0.07
    คาส          0.07
    لس           0.07
    logger       0.07
     Thiên       0.07
     adjustment  0.06
    Creat        0.06
     habit       0.06
    .tasks       0.06
     зм          0.06
    Act Density 0.025%

    No Known Activations