INDEX
    Explanations

    forest, Plateau, Offensive

    NEGATIVE LOGITS
    "िया"  1.89
    (token missing in source)  1.68
    " мнению"  1.59
    " vetting"  1.57
    " philanthropic"  1.53
    "ান্ড"  1.49
    (token missing in source)  1.49
    " लगाया"  1.47
    " reputation"  1.44
    " viac"  1.44
    POSITIVE LOGITS
    " girth"  2.12
    "ता"  2.06
    "gimento"  2.00
    "iacute"  1.94
    " torna"  1.94
    "𝐋"  1.90
    (token missing in source)  1.85
    " Embora"  1.85
    "Бо"  1.83
    " goi"  1.83
    Act Density 0.000%

    No Known Activations