    NEGATIVE LOGITS
    愿意          0.42
     verschil   0.40
     फरक        0.39
    Difference  0.39
    願意          0.39
                0.39
    difference  0.38
     hafa       0.38
     escre      0.38
    [{\         0.38
    POSITIVE LOGITS
     updated    0.62
     revised    0.58
     approved   0.57
     confirmed  0.57
     amended    0.57
    updated     0.56
     corrected  0.56
     reviewed   0.54
     inspected  0.54
     deleted    0.53
    Activation Density 0.000%

    No Known Activations