INDEX
    Explanations

    words starting with a letter

    Negative Logits
     ebenfalls  0.41
     également  0.40
     وكذلك  0.38
     tillegg  0.37
     পরিবর্তিত  0.36
     হইলেও  0.35
     tambahan  0.35
     dulu  0.35
    ब्रेरी  0.35
     migliorare  0.35
    Positive Logits
    4  0.50
    (blank)  0.45
    (blank)  0.45
    5  0.45
    1  0.44
    8  0.44
    (blank)  0.43
    (blank)  0.42
    (blank)  0.42
    6  0.42
    Act Density 0.012%

    No Known Activations