    Negative Logits
     originality    -0.08
     tasa           -0.08
     өзг            -0.08
     sugest         -0.08
    ":↵/            -0.08
    ":["            -0.08
    -paper          -0.07
     กล่าวว่า         -0.07
    -last           -0.07
    )','            -0.07
    Positive Logits
    Found           0.10
     Found          0.10
    _found          0.10
     encontrou      0.10
     FOUND          0.09
    FOUND           0.09
    _FOUND          0.09
    Detected        0.08
    (found          0.08
     found          0.08
    Act Density 0.002%

    No Known Activations