    Explanations

    interests and specific experiences

    Negative Logits
    Token           Logit
    " already"      -1.14
    "aj"            -1.10
    "んですか"       -1.00
    " basadas"      -0.98
    " wobec"        -0.98
    " хочу"         -0.96
    "futbolista"    -0.96
    " цветок"       -0.94
    "已经"           -0.90
    "↵↵↵↵"          -0.89
    Positive Logits
    Token           Logit
    " many"         1.23
    "Especially"    1.22
    " was"          1.21
    " שנים"         1.16
    " especially"   1.14
    " alltid"       1.10
    "until"         1.06
    "czerw"         1.06
    " especiais"    1.05
    "pewa"          1.05
    Activation Density: 0.011%

    No Known Activations