    NEGATIVE LOGITS
    Token                      Logit
    " '}';"                    0.39
    " dehuman"                 0.38
    " ተግባ"                     0.38
    " extruder"                0.38
    " 외부"                     0.37
    (whitespace-only token)    0.37
    " కంటే"                    0.36
    " chronically"             0.36
    " الجه"                    0.36
    "STTS"                     0.36
    POSITIVE LOGITS
    Token     Logit
    "8"       0.57
    "6"       0.57
    "7"       0.55
    "5"       0.55
    "3"       0.54
    "9"       0.52
    "info"    0.51
    "js"      0.50
    "ama"     0.48
    "2"       0.48
    Activation Density: 4.127%

    No Known Activations