INDEX
    Explanations

    instances of contrast or contradiction in statements

    Negative Logits
     مرئيه    -0.75
     disponibilités    -0.69
    anskje    -0.67
    ."</    -0.66
    ".\n    -0.63
     Baillargeon    -0.59
     Jefus    -0.59
     '\\;'    -0.58
    reszcie    -0.57
     PLWABN    -0.57
    Positive Logits
    とも    0.54
    évaluateur    0.48
    ungsbedingungen    0.48
     condenser    0.47
     kaynağından    0.47
    と思    0.46
     برانيه    0.46
    RTLD    0.45
    шма    0.45
    пыт    0.44
    Activation Density 0.417%

    No Known Activations