    NEGATIVE LOGITS
    Token             Logit
     shipbuilding     0.40
    tfine             0.39
     került           0.39
     நிற்க            0.38
    ೇಳ                0.38
                      0.37
    ˌ                 0.37
    ています          0.37
    くれました        0.36
     खेळा             0.36
    POSITIVE LOGITS
    Token     Logit
     ==       2.42
    ==        2.11
    =='       1.83
    ]==       1.80
    =="       1.77
    ()==      1.74
    )==       1.68
    ']==      1.58
     =='      1.53
    ==",      1.53
    Act Density 0.057%

    No Known Activations