INDEX
    Explanations

    action verbs followed by details or implications

    Negative Logits
    в           0.48
    на          0.46
    л           0.45
    ك           0.43
    لين         0.43
    (blank)     0.43
    а           0.42
    ता          0.42
    ts          0.42
    ת           0.42
    Positive Logits
    (blank)            0.49
     eV                0.48
     বীজ               0.47
     sikker            0.47
    (blank)            0.46
     säker             0.44
    南北               0.44
    highway            0.44
     buttonAnimation   0.44
     heller            0.43
    Act Density 0.002%

    No Known Activations