INDEX
    Explanations

    expressions indicating personal opinions or self-descriptions

    Negative Logits
     contextLoads       -0.91
    <?                  -0.79
    UpInside            -0.77
     Offisielt          -0.76
    berdayakan          -0.75
    featureID           -0.71
    NUMX                -0.70
    (:,:,               -0.69
    GIVEREF             -0.69
    存于互联网档案馆    -0.66
    Positive Logits
     Labor              0.46
     soal               0.45
     Flo                0.45
    Labor               0.42
     content            0.42
     labor              0.41
    jaan                0.40
     cal                0.40
     tem                0.40
    ын                  0.40
    Act Density 0.001%

    No Known Activations