    Explanations

    taking away, robbing, stealing

    Negative Logits
    "V"          0.66
    "其他"       0.62
    "रिक"        0.61
    "bindung"    0.57
    "вя"         0.56
    "خ"          0.55
    "ķ"          0.55
    "Fl"         0.53
    "त्य"        0.53
    "kow"        0.53
    Positive Logits
    " কেড়ে"           0.72
    " छीन"            0.65
    [token missing]   0.61
    " invade"         0.59
    " prodotto"       0.57
    " aree"           0.57
    " graze"          0.57
    " robbing"        0.56
    [token missing]   0.55
    " theft"          0.54
    Activation Density: 0.045%

    No Known Activations