    Explanations

    special characters or symbols in the text

    Negative Logits
    àµįà´
    -0.17
    /thumb
    -0.16
    ££
    -0.14
    àµ
    -0.14
    ,**
    -0.14
    âĢĤ
    -0.14
    â
    -0.14
    
    -0.14
    oje
    -0.14
    Ô
    -0.13
    Positive Logits
    âĶĢâĶĢ
    0.32
    âķ
    0.28
    âĶ
    0.26
     âĶ
    0.24
    âĶĢ
    0.22
     âķ
    0.22
     âĶľâĶĢâĶĢ
    0.22
    âĶĢâĶĢâĶĢâĶĢ
    0.21
    âͬ
    0.21
    âķĲ
    0.20
    Act Density 0.001%

    No Known Activations
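
The token strings in the logit lists above resemble vocabulary entries from a byte-level BPE tokenizer, where every raw byte is displayed as a printable stand-in character. The following is a minimal sketch for turning such strings back into readable text. It assumes the GPT-2-style byte-to-character table, which the page itself does not confirm, and `decode_token` is a hypothetical helper name introduced here for illustration.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2-style
    byte-level BPE (assumed here; the source page does not name its tokenizer)."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in printable}
    shift = 0
    for b in range(256):
        if b not in table:
            # Non-printable bytes are shifted to code points 256 and up.
            table[b] = chr(256 + shift)
            shift += 1
    return table


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Hypothetical helper: map a displayed token string back to text.

    Characters outside the table (for example a literal leading space)
    are passed through as their own UTF-8 bytes. Incomplete UTF-8
    sequences come out as the U+FFFD replacement character.
    """
    raw = b"".join(
        bytes([CHAR_TO_BYTE[ch]]) if ch in CHAR_TO_BYTE else ch.encode("utf-8")
        for ch in token
    )
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["âĶĢâĶĢ", " âĶľâĶĢâĶĢ", "âĶĢâĶĢâĶĢâĶĢ", "âĶ"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, several of the positive-logit tokens decode to box-drawing characters (for instance `âĶĢ` becomes `─` and ` âĶľâĶĢâĶĢ` becomes ` ├──`), which fits the explanation "special characters or symbols in the text". Shorter entries such as `âĶ` are partial byte sequences and have no standalone rendering.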