INDEX
    Explanations

    corresponds

    Negative Logits
    "Pel"          -0.09
    " Lud"         -0.08
    " pel"         -0.08
    " vere"        -0.08
    " upset"       -0.07
    (not shown)    -0.07
    " Wolf"        -0.07
    (not shown)    -0.07
    (not shown)    -0.07
    " deline"      -0.07
    Positive Logits
    "_pas"         0.08
    " prie"        0.07
    (not shown)    0.07
    " Poh"         0.07
    " Oc"          0.07
    " hic"         0.07
    " Petroleum"   0.07
    " lực"         0.07
    " gir"         0.07
    "achi"         0.07
    Activation Density: 0.042%

    No Known Activations