    Explanations

    phrases that convey subjective evaluations or characteristics of subjects

    Negative Logits
     estekak
    -0.57
     Judea
    -0.55
    ientôt
    -0.50
     Fußballspieler
    -0.49
    تقاوى
    -0.49
    IsMutable
    -0.48
    tvguidetime
    -0.47
     $_"
    -0.47
    /*
    -0.46
     تضيفلها
    -0.46
    Positive Logits
     WHICH
    0.54
     which
    0.49
     Which
    0.49
     vilket
    0.46
    which
    0.44
    Which
    0.43
     pretty
    0.41
     což
    0.38
     vilken
    0.38
     hvilket
    0.38
    Activation Density: 0.463%

    No Known Activations