NEGATIVE LOGITS
    ل  0.91
    "  0.86
    v  0.82
     destac  0.79
    ا  0.79
    ı  0.77
     வகை  0.76
     wasteland  0.75
    um  0.74
     {//  0.73
POSITIVE LOGITS
    ای  0.99
    [token not shown]  0.92
     Cooperation  0.91
     Zusammenarbeit  0.86
    ות  0.85
     cooperation  0.81
    q  0.80
    way  0.78
    ust  0.77
    위를  0.77
    Activation Density: 0.004%

    No Known Activations