INDEX
    Explanations

    numbers, ratios, and units

    Negative Logits
    DIFF  0.38
    killing  0.38
    (no token shown)  0.37
    BorderStyle  0.37
    щает  0.36
     wskaz  0.36
    (no token shown)  0.36
     عمليه  0.36
    (no token shown)  0.36
    lenie  0.36
    Positive Logits
    ێن  0.42
     adopted  0.39
    (no token shown)  0.39
     ऑड  0.38
     മൊ  0.38
    arab  0.38
    (no token shown)  0.38
     pemer  0.37
     Miss  0.36
     muss  0.36
    Act Density 0.003%

    No Known Activations