    Explanations

    phrases indicating a minimum amount or threshold

    Negative Logits
    Token         Logit
    " even"       -0.18
    " EVEN"       -0.17
    " actually"   -0.16
    "aphore"      -0.16
    "even"        -0.16
    "anio"        -0.16
    "甚至"        -0.15
    " хотя"       -0.15
    " either"     -0.15
    "uchen"       -0.15
    Positive Logits
    Token         Logit
    " until"      0.22
    " according"  0.21
    "until"       0.20
    "Until"       0.19
    " ones"       0.18
    " Until"      0.18
    "according"   0.16
    "asm"         0.15
    " According"  0.15
    " hasta"      0.15
    Activation Density: 0.025%

    No Known Activations