    NEGATIVE LOGITS
    " Forward"        0.62
    "Forward"         0.59
    "Foreground"      0.58
    " Foreground"     0.56
    " Front"          0.55
    " Forecast"       0.50
    " Frontend"       0.50
    " foreground"     0.49
    " Forecasting"    0.49
    "forward"         0.49
    POSITIVE LOGITS
    " look"       0.52
    "look"        0.52
    "Look"        0.51
    " fram"       0.46
    " اهلا"       0.43
    "LOOK"        0.43
    "akoti"       0.40
    " පිළිබ"      0.39
    "fram"        0.39
    " bardzo"     0.39
    Activation Density: 0.002%

    No Known Activations