INDEX
    Negative Logits
     Houſe            -0.59
     becauſe          -0.57
     ſeveral          -0.57
     ſta              -0.57
     ſmall            -0.56
     Efq              -0.56
     occaf            -0.56
     nahilalakip      -0.56
     iſt              -0.54
     $=-              -0.54

    Positive Logits
    <bos>             0.72
     مشارکت‌کنندگان    0.51
    setHorizontal     0.44
     női              0.43
    onAttach          0.43
     dönt             0.42
    dom               0.42
    top               0.42
    doms              0.42
    YNC               0.42
    Activation Density: 0.010%

    No Known Activations