INDEX
    Explanations

    instances of claims and accusations related to historical narratives and their verification or debunking

    Negative Logits
    Token              Logit
    " hypoc"           -0.16
    "μβ"               -0.15
    "Unexpected"       -0.15
    " Unexpected"      -0.15
    " unpredict"       -0.15
    "ronic"            -0.15
    " unexpectedly"    -0.14
    " تÙĦ"             -0.14
    "akra"             -0.14
    "umi"              -0.14
    Positive Logits
    Token              Logit
    " fiction"         0.29
    " fabrication"     0.29
    " unsupported"     0.28
    " Fabric"          0.26
    " fantasy"         0.26
    " Fiction"         0.26
    " hears"           0.25
    " fanc"            0.25
    "fabric"           0.25
    "Fabric"           0.25
    Act Density 0.406%

    No Known Activations