    Explanations

    phrases that indicate the initiation of processes, actions, or contributions in discussions

    Negative Logits
    Token             Logit
    "aco"             -0.16
    "chalk"           -0.15
    "UserDefaults"    -0.14
    "\<^"             -0.14
    "таб"             -0.14
    " Ελλην"          -0.14
    "UILTIN"          -0.14
    "ourke"           -0.14
    "gross"           -0.14
    "avig"            -0.14
    Positive Logits
    Token             Logit
    "imation"         0.16
    " Ney"            0.15
    "achu"            0.15
    " Gaut"           0.15
    "agos"            0.14
    " tir"            0.14
    " relative"       0.14
    "iosis"           0.14
    "oles"            0.14
    "azi"             0.14
    Activation Density: 0.001%

    No Known Activations