    Explanations
    NEGATIVE LOGITS
    Logit  Token
    2.17   "ان"
    1.89   "ции"
    1.84   "ز"
    1.79   "ف"
    1.66   "Mein"
    1.64   (no visible token)
    1.57   "GS"
    1.57   "ר"
    1.56   "FT"
    1.56   "ください"
    POSITIVE LOGITS
    Logit  Token
    1.91   "works"
    1.90   " dulce"
    1.80   "difficult"
    1.80   "colours"
    1.78   "melon"
    1.77   "ння"
    1.73   "corollary"
    1.69   "í"
    1.68   "ab"
    1.67   " Assess"
    Activation density: 0.040%

    No Known Activations