INDEX
    Explanations

    phrases indicating repeated occurrences or frequency

    Negative Logits
    "assin"         -0.17
    "ateur"         -0.17
    "ktor"          -0.14
    " OTHERWISE"    -0.14
    "WR"            -0.14
    "ricular"       -0.14
    " Ellis"        -0.14
    "FRING"         -0.13
    " Cove"         -0.13
    "lee"           -0.13

    Positive Logits
    "759"            0.17
    "ót"             0.15
    "ãĥ©ãĤ¹"         0.15
    "Flip"           0.14
    "stoi"           0.14
    " Sweep"         0.14
    " sweep"         0.13
    "ави"            0.13
    " Sender"        0.13
    "etak"           0.13

    Act Density 0.003%

    No Known Activations