INDEX
    NEGATIVE LOGITS
    Token                 Logit
    "oa̍t"                 -0.60
    "parsedMessage"       -0.57
    " Flare"              -0.56
    (no token shown)      -0.56
    "Flare"               -0.55
    " Drill"              -0.55
    " Wikimedijinoj"      -0.54
    "Privacy"             -0.53
    "Flex"                -0.53
    " Confidentiality"    -0.51
    POSITIVE LOGITS
    Token                 Logit
    "according"           0.79
    "According"           0.77
    " According"          0.72
    " Según"              0.70
    "Secondo"             0.69
    " according"          0.65
    "Según"               0.64
    "según"               0.63
    " Secondo"            0.61
    "Selon"               0.60
    Act Density 0.012%

    No Known Activations