INDEX
    Explanations

    phrases that express a significant quantity or degree

    Negative Logits
    Unchecked   -0.18
    ubic        -0.15
    AGER        -0.15
     nonatomic  -0.14
    aits        -0.14
    UBL         -0.14
    aterno      -0.14
    à¹Ģà¸ľ      -0.14
    .Factory    -0.14
    exels       -0.14
    Positive Logits
     ado        0.20
    563         0.16
    ammad       0.15
    uh          0.15
    ilent       0.14
    romatic     0.14
    -needed     0.14
    kem         0.14
    809         0.14
    wu          0.14
    Act Density 0.028%

    No Known Activations