INDEX
    Explanations
    Negative Logits
         untouched      -0.08
         Guid           -0.08
         encompasses    -0.08
        ливо            -0.07
        ].[             -0.07
        /build          -0.07
         encompass      -0.07
         વિસ્ત           -0.07
         GUID           -0.07
        лив             -0.07
    Positive Logits
         thanking       0.08
         धन्यवाद         0.08
         Thank          0.08
        回应            0.08
        感谢            0.08
        自拍            0.08
        uy              0.08
         Teb            0.08
         thanked        0.08
         gratitude      0.08
    Activation Density: 0.007%

    No Known Activations