INDEX
    Explanations

    phrases that indicate substantial reasoning or justification for claims

    Negative Logits
    IsMutable       -0.68
     defaultstate   -0.66
    apimachinery    -0.60
    \}\\            -0.60
    stdafx          -0.56
     متعلقه         -0.55
    WebControls     -0.55
    "});            -0.55
    fjspx           -0.55
    NameInMap       -0.53
    Positive Logits
     legitimate     0.66
     legit          0.65
     admit          0.64
    Autoritní       0.64
     valid          0.64
     truth          0.63
     Honest         0.59
    的确            0.58
    确实            0.58
     VALID          0.58
    Act Density 0.081%

    No Known Activations