INDEX
    Explanations

    fascism, totalitarianism, authoritarianism

    Negative Logits
    ]++;              0.92
     anima            0.89
    𝙽                 0.87
     counterfeit      0.87
                      0.86
    CHARGE            0.86
     cardiomyocytes   0.85
     dirigida         0.85
     conjunction      0.84
     comentar         0.83
    Positive Logits
    ک                 0.79
    𝙚                 0.77
    ير                0.74
    <bos>             0.74
    ര്‍ക്കും            0.72
    kort              0.71
     dictators        0.69
     regime           0.69
    ƅ                 0.68
     Batterie         0.68
    Activation Density 0.117%

    No Known Activations