INDEX
    Explanations

    phrases indicating concern or care for others

    Negative Logits
    viso         -0.16
    ÑĨин         -0.16
    uga          -0.16
    ummer        -0.15
    esor         -0.15
    651          -0.15
     Crown       -0.14
    ovich        -0.14
    alian        -0.14
    ưá»Ŀi        -0.14
    Positive Logits
    rieb         0.15
    ÙĤب          0.15
    lied         0.15
     pipeline    0.14
     and         0.14
     Banc        0.14
     conduit     0.14
    æľºåħ³       0.14
     Lap         0.14
    squ          0.14
    Act Density 0.015%

    No Known Activations