    Explanations

    statements about truthfulness or falsehood

    Negative Logits
    ReusableCell    -0.46
    )"),            -0.44
     taal           -0.42
    \{\\            -0.42
     indietro       -0.41
     îna            -0.41
     tilbake        -0.39
     vermelhas      -0.39
     rojas          -0.39
     tilbage        -0.38
    Positive Logits
     kasarigan         0.99
     intptr            0.83
     autorytatywna     0.78
     estekak           0.75
    twimg              0.75
     typelib           0.72
     ivelany           0.72
    awaiter            0.70
     <<<<<<<<<<<<<<    0.69
    rungsseite         0.68
    Activation Density: 0.859%

    No Known Activations