INDEX
    Explanations

    anced/ivated

    Negative Logits
    "357"         -0.07
    " excuses"    -0.06
    "็กซ"         -0.06
    " слиз"       -0.06
    " persec"     -0.06
    " Peer"       -0.06
    "Hom"         -0.06
    "iphy"        -0.06
    " minh"       -0.06
    "(Pos"        -0.06
    Positive Logits
    " fascination"    0.10
    " fascinated"     0.10
    " захоп"          0.08
    " captivating"    0.08
    " enchant"        0.07
    " capturing"      0.07
    " bew"            0.07
    "forEach"         0.06
    " Fra"            0.06
    " Null"           0.06
    Act Density 0.015%

    No Known Activations