    Negative Logits
    Token             Logit
    " Feb"            -0.08
    " رفت"            -0.07
    "láš"             -0.07
    "GOP"             -0.07
    "aaa"             -0.07
    " Kay"            -0.07
    "Cart"            -0.07
    "Tag"             -0.07
    " evacuation"     -0.06
    " Dependency"     -0.06
    Positive Logits
    Token             Logit
    " shines"         0.09
    " shine"          0.09
    " Shine"          0.09
    " shining"        0.08
    "shine"           0.08
    " sunshine"       0.08
    " in"             0.08
    " Sunshine"       0.08
    "in"              0.08
                      0.07
    Activation Density: 0.004%

    No Known Activations