INDEX
    Explanations

    cover/identify/exceeded expectations

    Negative Logits
    "ˌ"            0.95
    " phức"        0.94
    " multidis"    0.94
    " equilíbrio"  0.91
    "icionados"    0.87
    " decept"      0.86
    "ahang"        0.86
    " impecc"      0.85
    " prestige"    0.84
    " wład"        0.84
    Positive Logits
    "и"            1.42
                   1.33
    "ر"            1.30
    "そして"       1.26
    " וה"          1.19
    "י"            1.19
    "ি"            1.17
    "ن"            1.15
    " وتح"         1.13
    " ومن"         1.12
    Act Density 0.001%

    No Known Activations