    Explanations

    mentions of inconsistencies or discrepancies in reasoning or arguments

    Negative Logits
    "makeText"          -0.64
    " Netz"             -0.63
    "@[+]["             -0.61
    "estens"            -0.58
    " Dienst"           -0.57
    " Erde"             -0.57
    "voyez"             -0.57
    "sizeCache"         -0.57
    "ruzzo"             -0.57
    "SerializedSize"    -0.57
    Positive Logits
    " discrepancy"      0.99
    " contradictions"   0.94
    " Incon"            0.93
    " discrepancies"    0.92
    " contradiction"    0.85
    " Discre"           0.81
    "Zeneca"            0.80
    " contradictory"    0.79
    "discre"            0.78
    " contradic"        0.75
    Activation Density: 0.024%

    No Known Activations