INDEX
    Explanations

    expressions or phrases that indicate observation or perception

    Negative Logits
    " rated"          -0.32
    "bestimmungen"    -0.32
    " mourut"         -0.31
    " Bikin"          -0.30
    " Eq"             -0.29
    "abito"           -0.28
    " عليك"           -0.28
    "Tutto"           -0.27
    "と思ったら"       -0.27
    "standig"         -0.27
    Positive Logits
    "idać"               0.80
    " snippetHide"       0.72
    "OGND"               0.68
    "SequentialGroup"    0.65
    "Obvious"            0.65
    "ReusableCell"       0.63
    "Anhalt"             0.62
    "存于互联网档案馆"     0.62
    " widać"             0.60
    " unſer"             0.60
    Activation Density: 0.028%

    No Known Activations