    Explanations

    instances of the word "can" and its variations, indicating potential or ability

    Negative Logits
    Token       Logit
    "ufe"       -0.15
    "lette"     -0.15
    "�s"        -0.14
    " �"        -0.14
    "jos"       -0.14
    "elay"      -0.14
    " Olsen"    -0.13
    "γι"        -0.13
    "arti"      -0.13
    "rost"      -0.13
    Positive Logits
    Token       Logit
    "'t"        0.56
    "’t"        0.52
    " neither"  0.43
    "不了"      0.35
    " never"    0.32
    " not"      0.31
    " cannot"   0.31
    "不能"      0.31
    " ikke"     0.30
    " nicht"    0.29
    Act Density 0.135%
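
As a rough illustration of the density figure, the sketch below assumes that "Act Density" means the fraction of sampled token positions on which the feature has a nonzero activation. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; none of them come from the dashboard itself.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density is counted per token position, and a position is
    "active" when its activation is strictly greater than the threshold.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 135 of them active.
sample = [0.0] * 99_865 + [1.2] * 135
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.135%
```

Under this reading, a density of 0.135% means the feature fires on roughly 1 in every 740 tokens.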

    No Known Activations