INDEX
    Explanations

    more detailed or comparative

    Negative Logits
    !             0.61
     thousands    0.54
     phenomena    0.53
     countless    0.53
     and          0.51
     either       0.50
     these        0.50
    ?             0.48
     henceforth   0.48
     permeated    0.48
    Positive Logits
     πιο          0.80
     bardziej     0.80
                  0.76
                  0.74
     Lebih        0.73
     Variante     0.71
     більш        0.70
     divertida    0.68
     ساده         0.66
     lebih        0.66
    Act Density 0.009%

    No Known Activations