INDEX
    Explanations

    phrases indicating logical contradictions

    Negative Logits
     varandra         -0.62
    glise             -0.62
    curité            -0.60
    ">//              -0.57
    BERGER            -0.56
    regler            -0.54
     regler           -0.53
     verksamhet       -0.52
    rous              -0.52
    Boas              -0.51
    Positive Logits
     Pristupljeno     0.75
     فريبيس           0.62
    RegressionTest    0.61
     tartalomajánló   0.60
    NewGuid           0.59
    ynchronously      0.58
    randomUUID        0.57
    Krak              0.57
     Rag              0.55
    FRAG              0.55
    Act Density 0.005%

    No Known Activations