INDEX
    Explanations

    phrases indicating reference or direction

    Negative Logits
    "ff"         -0.07
    "ck"         -0.07
    "ament"      -0.06
    "ez"         -0.06
    " al"        -0.06
    "oundary"    -0.06
    " today"     -0.06
    " nobody"    -0.06
    " Brit"      -0.05
    "irit"       -0.05

    Positive Logits
    "erview"     0.08
    "ouncil"     0.08
    "æĻĵ"        0.07
    "engin"      0.07
    "enville"    0.07
    "illard"     0.07
    "ÄŁÃ¼"       0.07
    "abama"      0.07
    "ACHER"      0.07
    "çŁ"         0.07

    Act Density 0.002%

    No Known Activations