INDEX
    Explanations

    the word "fact" and phrases indicating certainty or factual statements

    Negative Logits
    Token            Logit
    "ULD"            -0.15
    "asser"          -0.15
    " Manson"        -0.14
    "lte"            -0.14
    "ledger"         -0.14
    "se"             -0.14
    "EventListener"  -0.14
    "Ðĭ"             -0.13
    "à¥įà¤Ĺ"         -0.13
    "ams"            -0.13
    Positive Logits
    Token            Logit
    " fact"          0.20
    " that"          0.20
    "itious"         0.17
    " bahwa"         0.17
    "avana"          0.16
    "egasus"         0.16
    "585"            0.15
    " Fact"          0.15
    "that"           0.14
    "586"            0.14
    Activation Density: 0.014%

    No Known Activations