INDEX
    Explanations
    Negative Logits
     IFC  0.52
     කියලා  0.46
     stalwart  0.46
    考えて  0.46
    不过  0.45
     equivoc  0.45
     CIF  0.45
    这么  0.44
    Посилання  0.44
     ALU  0.44
    Positive Logits
    <eos>  0.43
    ati  0.43
    ore  0.43
    0.42
    ov  0.42
    vant  0.42
    ori  0.41
    or  0.41
    ir  0.41
    ror  0.40
    Act Density 0.294%

    No Known Activations