INDEX
    Explanations
    Negative Logits
        "ک"        0.84
        "لا"       0.79
        "ق"        0.74
        "ری"       0.74
        "rägt"     0.70
        "پ"        0.69
        "人都"     0.69
        (blank)    0.69
        " кра"     0.68
        "u"        0.68
    Positive Logits
        "NG"            0.81
        "মিক্যাল"        0.81
        "вары"          0.80
        "лад"           0.77
        "hotel"         0.75
        " hoteles"      0.75
        "енты"          0.74
        " caucasian"    0.73
        "льности"       0.73
        "ОР"            0.73
    Act Density 0.002%

    No Known Activations