    Negative Logits
    Token               Logit
    ' bar'              -0.84
    ' Bar'              -0.67
    '<bos>'             -0.60
    'Bar'               -0.59
    'bar'               -0.53
    ' Re'               -0.49
    'on'                -0.49
    ' BAR'              -0.48
    ' bars'             -0.45
    ' dø'               -0.45
    Positive Logits
    Token               Logit
    ':✨'                0.81
    ' Egli'             0.77
    ' $_"'              0.74
    ' незавершена'      0.72
    ' ostavi'           0.69
    'transQ'            0.69
    ' merino'           0.68
    'ItemBackground'    0.68
    'Personendaten'     0.67
    ' sega'             0.66
    Activation Density: 0.037%

    No Known Activations