INDEX
    Explanations

    words related to the concept of "can" or ability

    Negative Logits
    Token       Logit
    thon        -0.17
     thác       -0.16
    Ïģια        -0.15
    baÅŁ        -0.15
    itting      -0.15
    ITY         -0.15
    erti        -0.15
    ry          -0.14
    æĺŃ         -0.14
    oles        -0.14

    Positive Logits
    Token       Logit
    ing         0.19
    woord       0.18
    elope       0.18
    elerik      0.16
    ler         0.16
    y           0.16
     Absolute   0.16
    yaw         0.15
    ucket       0.15
    uario       0.15

    Act Density 0.034%

    No Known Activations