    NEGATIVE LOGITS
        0.71  "د"
        0.63  "ب"
        0.57  "م"
        0.54  "as"
        0.53  (blank)
        0.50  "氛围"
        0.50  " Verbesserung"
        0.48  "B"
        0.46  (blank)
        0.46  "in"
    POSITIVE LOGITS
        0.84  " आदमी"
        0.78  "hombre"
        0.78  " man"
        0.77  " uomo"
        0.77  " رجل"
        0.77  " человеку"
        0.74  " pria"
        0.74  " čovjek"
        0.73  " uomini"
        0.73  " hombre"
    Act Density 0.068%

    No Known Activations