INDEX
    Explanations
    Negative Logits
    " wor"            -0.12
    "ONO"             -0.09
    "ono"             -0.09
    "wor"             -0.09
    " Await"          -0.08
    " challenged"     -0.08
    "amar"            -0.08
    " nghiệ"          -0.08
    " invitation"     -0.08
    " Garr"           -0.08
    Positive Logits
    " Wants"          0.22
    " needs"          0.21
    "Needs"           0.21
    " Needs"          0.21
    " wants"          0.20
    "needs"           0.19
    " desires"        0.18
    " wishes"         0.16
    "需求"            0.16
    " 욕"             0.16
    Activation Density: 0.121%

    No Known Activations