    Explanations

    phrases indicating changes or conversions

    Negative Logits
    "}")            -0.70
    hésite          -0.64
     Kenne          -0.63
     Dade           -0.57
    peper           -0.56
     comigo         -0.55
    testnet         -0.55
    ednesdays       -0.55
    Diwedd          -0.54
    معلومات         -0.54
    Positive Logits
    值为            0.71
     become         0.71
     a              0.68
    Become          0.67
     deviennent     0.65
     diventare      0.65
    改为            0.62
    变成            0.62
    zerw            0.62
     becoming       0.62
    Activation Density: 0.422%

    No Known Activations