INDEX
    Explanations
    Negative Logits
     crust  -0.72
    フォト  -0.71
    Проце  -0.71
    gaver  -0.71
     яйца  -0.71
     aco  -0.70
     露  -0.70
    Сту  -0.68
    indicator  -0.67
     volonté  -0.67
    Positive Logits
    Bias  0.89
     genealogy  0.82
    bias  0.81
    allel  0.80
     duiz  0.79
    NIST  0.78
     Rek  0.77
     biased  0.77
     AI  0.76
     facial  0.75
    Activation Density: 0.021%

    No Known Activations