INDEX
    Explanations

    phrases with special characters and symbols like arrows

    symbols or special characters used in different contexts

    Negative Logits
        Token            Logit
        " scatter"       -0.78
        " dirt"          -0.72
        " cyan"          -0.65
        " blond"         -0.63
        " rooting"       -0.63
        "wagen"          -0.62
        "lda"            -0.62
        " bung"          -0.62
        " sled"          -0.62
        " secretary"     -0.61

    Positive Logits
        Token            Logit
        "£"              1.13
        "âĢł"            0.99
        "¹"              0.97
        "º"              0.96
        "į"              0.95
        "¢"              0.94
        "âĹ¼"            0.93
        "¡"              0.92
        "catentry"       0.91
        "ı"              0.89
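
Several positive-logit tokens (for example âĢł and âĹ¼) look like byte-level BPE token strings rather than ordinary text. The sketch below assumes, without confirmation from this page, that the vocabulary uses the GPT-2-style byte-to-character table; under that assumption it maps each token back to raw bytes and tries to read them as UTF-8. The helper names `token_to_bytes` and `decode_token` are hypothetical and chosen only for illustration.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte (0-255) to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n, in byte order.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Convert a displayed token string back to the raw bytes it stands for."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


def decode_token(token: str) -> str:
    """Decode a token as UTF-8; incomplete byte sequences become U+FFFD."""
    return token_to_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["âĢł", "âĹ¼", "£", "į"]:
        print(repr(tok), token_to_bytes(tok).hex(), repr(decode_token(tok)))
```

If the assumption holds, âĢł corresponds to bytes `e2 80 a0` (the dagger, †) and âĹ¼ to `e2 97 bc` (◼), while single-character tokens such as £, į, and ı are lone UTF-8 continuation bytes that only form a visible symbol together with preceding bytes. That reading would be consistent with the explanations above, which describe special characters and symbols.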
    Activation Density: 0.901%

    No Known Activations