INDEX
    Explanations

    statements that emphasize the importance of facts or factual evidence

    Negative Logits
    Token         Logit
    "افت"         -0.19
    "stras"       -0.17
    "lops"        -0.15
    "ście"        -0.15
    "dings"       -0.15
    "mts"         -0.14
    "mtree"       -0.14
    "holm"        -0.14
    " indirect"   -0.14
    "śmy"         -0.14
    Positive Logits
    Token         Logit
    "ually"       0.30
    "itious"      0.27
    " fact"       0.26
    "oring"       0.26
    "oid"         0.25
    "ored"        0.24
    "uality"      0.24
    "um"          0.23
    " Fact"       0.22
    "oids"        0.20
    Activation Density 0.027%

    No Known Activations