INDEX
    Negative Logits
    " aven"             -0.07
    "lyph"              -0.07
    "rightness"         -0.07
    " noodles"          -0.07
    " retrieve"         -0.07
    " manipulate"       -0.07
    "TIN"               -0.07
    " instantaneous"    -0.07
    "TEM"               -0.07
    "ાજ"                -0.07
    Positive Logits
    "培训"              0.13
    " educating"        0.11
    " تدريب"            0.11
    " opleiding"        0.11
    (not shown)         0.10
    " workshops"        0.10
    " briefing"         0.10
    " प्रशिक्षण"          0.10
    " التدريب"          0.10
    " Workshops"        0.09
    Activation Density: 0.055%

    No Known Activations