    NEGATIVE LOGITS
    Token               Logit
    "-city"             -0.07
    "tie"               -0.07
    " pathways"         -0.07
    "ík"                -0.06
    " pathway"          -0.06
    " pillars"          -0.06
    " famine"           -0.06
    "_retry"            -0.06
    " tým"              -0.06
    " neutr"            -0.06
    POSITIVE LOGITS
    Token               Logit
    " Brush"            0.17
    " brush"            0.17
    "Brush"             0.14
    "brush"             0.14
    " brushes"          0.12
    " brushed"          0.11
    " Brushes"          0.10
    " brushing"         0.10
    "ush"               0.08
    [token not shown]   0.07
    Activation Density: 0.004%

    No Known Activations