INDEX
    Explanations

    leading to choice or consequence

    Negative Logits
     noodles  0.39
     हालांकि  0.39
     bristles  0.39
     oats  0.38
     doubts  0.38
     bearings  0.38
     zigzag  0.38
     knuckles  0.38
     analogs  0.38
     novices  0.37
    Positive Logits
     telah  0.67
    选择了  0.63
    把它  0.55
     इन्होंने  0.54
     выбрали  0.54
     сделали  0.52
     offenbar  0.51
     hayan  0.48
     उन्‍ह  0.48
     Somehow  0.48
    Act Density 0.040%

    No Known Activations