INDEX
    Explanations

    significant nouns, particularly those related to people’s experiences and emotions

    Negative Logits
    ona
    -0.17
    STE
    -0.15
    796
    -0.15
    ONA
    -0.15
     Ste
    -0.14
    ior
    -0.14
    Inlining
    -0.14
    еса
    -0.14
    _tunnel
    -0.14
    TECTED
    -0.14
    Positive Logits
    asil
    0.17
    ombat
    0.15
    че
    0.15
    kowski
    0.15
    ùng
    0.15
     Doyle
    0.15
    ehr
    0.14
     Pell
    0.14
    veal
    0.14
    oud
    0.14
    Activation Density: 0.002%

    No Known Activations