INDEX
    Explanations
    NEGATIVE LOGITS
    and         0.62
    reform      0.51
    on          0.50
    prisma      0.47
    ma          0.47
    un          0.46
    uck         0.45
    op          0.45
    s           0.45
    but         0.44
    POSITIVE LOGITS
     το         0.55
                0.50
     τα         0.48
     الخلق      0.47
     τον        0.47
     φ          0.47
     μεγάλη     0.46
                0.46
     Το         0.46
    Deviation   0.46
    Act Density 0.003%

    No Known Activations