    NEGATIVE LOGITS
     ",",         -0.06
    (mp           -0.06
     empath       -0.06
    (blank)       -0.06
    segments      -0.06
    ();↵↵↵        -0.06
    CAL           -0.06
    ocop          -0.06
    (blank)       -0.06
    (blank)       -0.06
    POSITIVE LOGITS
     vra          0.07
     πριν         0.07
     coins        0.06
     mechanism    0.06
     touring      0.06
     native       0.06
     Showcase     0.06
     READ         0.06
     energetic    0.06
    Figure        0.06
    Act Density 0.041%

    No Known Activations