INDEX
    Explanations
    NEGATIVE LOGITS
    "why"        0.40
    " зачем"     0.40
                 0.38
                 0.37
    "为何"       0.37
    "برای"       0.37
    " 알아"      0.37
    " جهت"       0.36
    "PR"         0.36
    "気持ち"     0.35
    POSITIVE LOGITS
    " saw"       0.98
    " seen"      0.95
    " gesehen"   0.88
    "saw"        0.86
    " видел"     0.83
    " Seen"      0.82
    " Saw"       0.80
    "Seen"       0.80
    "Saw"        0.78
    " sights"    0.77
    Act Density 0.011%

    No Known Activations