INDEX
    Explanations

    misleading or false information in text, including statements that are incorrect or fraudulent

    Negative Logits
    iments    -0.80
    ĸļ        -0.79
    iment     -0.74
    oleon     -0.69
    ista      -0.63
    isms      -0.60
    iens      -0.59
    arya      -0.58
    achine    -0.57
    illas     -0.57
    Positive Logits
    named             0.68
     inflated         0.67
     priced           0.67
     diagnosed        0.66
    ãĤ©               0.66
     unfocusedRange   0.65
    label             0.64
     accused          0.62
    ball              0.62
     accuse           0.62
    Activation Density: 7.534%

    No Known Activations