    Explanations

    foreign language characters

    NEGATIVE LOGITS
    " This"           0.38
    " While"          0.37
    " We"             0.35
    " Although"       0.35
    " Quatre"         0.34
    " Despite"        0.34
    " Trois"          0.34
    "<unused2173>"    0.34
    " dez"            0.33
    " Holl"           0.33
    POSITIVE LOGITS
    "ಜ್ಞ"             0.41
    "ън"              0.37
                      0.35
    "дің"             0.35
    " Интере"         0.35
    "იან"             0.34
    "есть"            0.34
    " самим"          0.34
    "clusión"         0.34
    "лада"            0.34
    Act Density 0.212%

    No Known Activations