INDEX
    Explanations

    phrases indicating confirmation or assurance of previous statements

    Negative Logits
    this
    -0.18
     this
    -0.18
    éĤ£æł·
    -0.15
       
    -0.15
    zer
    -0.15
    è¿Ļä¸Ģ
    -0.15
     ts
    -0.15
    .ts
    -0.15
    thus
    -0.15
     pilot
    -0.14
    Positive Logits
     happen
    0.17
     happening
    0.17
     happens
    0.17
    ìłĢ
    0.15
    WithTitle
    0.15
    íĮĶ
    0.15
     happened
    0.15
    ãĥ¼ãĥIJ
    0.15
    Rick
    0.14
    Them
    0.14
    Activation Density: 0.275%

    No Known Activations