    Explanations

    phrases indicating additional context or details in discussions

    Negative Logits
    APT        -0.16
    entario    -0.15
    angkan     -0.15
    jist       -0.14
    hive       -0.14
    uali       -0.14
    รม         -0.14
    stras      -0.14
    Limits     -0.14
    are        -0.14
    Positive Logits
    ovit       0.15
    apia       0.15
    δεÏĤ       0.15
    oppins     0.14
    çŃij       0.14
    *)((       0.14
    burn       0.14
    ê°Ŀ        0.14
     Welch     0.14
    unsch      0.14
    Activation Density 0.010%

    No Known Activations