INDEX
    Explanations

    lists after punctuation

    Negative Logits
    " "                 0.55
    "ÇÃO"               0.42
    " °"                0.40
    (no token shown)    0.40
    (no token shown)    0.40
    (no token shown)    0.39
    " م"                0.38
    "ปี"                0.38
    " VS"               0.38
    " uit"              0.37
    Positive Logits
    " allowing"    0.57
    " Allows"      0.55
    " позволя"     0.55
    "you"          0.54
    " Allowing"    0.54
    "allowing"     0.54
    "जिसे"         0.53
    " You"         0.52
    "Allows"       0.52
    "nYou"         0.52
    Act Density 0.000%

    No Known Activations