    Explanations

    instances of factual statements or claims

    Negative Logits
    Token           Logit
    " therefore"    -0.17
    "thus"          -0.17
    " thus"         -0.16
    "же"            -0.15
    "Therefore"     -0.14
    "次"            -0.13
    "UTE"           -0.13
    " इसल"          -0.13
    "oven"          -0.13
    "vens"          -0.13
    Positive Logits
    Token           Logit
    " according"    0.30
    " According"    0.28
    " इसम"          0.25
    "According"     0.24
    "according"     0.23
    " here"         0.23
    "其中"          0.23
    "该"            0.23
    " Among"        0.22
    " therein"      0.22
    Act Density 0.116%

    No Known Activations