INDEX
    Explanations

    Hebrew characters or words

    non-English characters or symbols

    NEGATIVE LOGITS
    ovie          -0.93
    gdala         -0.85
    ooth          -0.81
    holes         -0.77
     newsp        -0.75
    ellen         -0.73
    essa          -0.71
    sonian        -0.70
     Flavoring    -0.69
    leground      -0.68
    POSITIVE LOGITS
    à¤            1.38
    Ĺ             1.25
    ×             1.13
    ¤             1.11
    ×ķ            1.10
    ×Ļ            1.07
     à¤           1.04
    ĵ             1.04
    ķ             1.04
    ¢             1.02
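
The positive-logit tokens above look like byte-level BPE token strings rather than ordinary text. The sketch below assumes (this is not stated on the page) that they use the GPT-2 style byte-to-character table, and maps each displayed character back to its raw byte. The helper names `bytes_to_unicode`, `token_to_bytes`, and `describe` are illustrative, not part of any dashboard API. Under that assumption, the two-character tokens decode to Hebrew letters, which would be consistent with the "Hebrew characters or words" explanation.

```python
def bytes_to_unicode():
    """Rebuild a GPT-2 style table: every byte 0..255 -> one printable char."""
    # Bytes that are already printable keep their own code point.
    bs = (list(range(0x21, 0x7F))      # '!' .. '~'
          + list(range(0xA1, 0xAD))    # U+00A1 .. U+00AC
          + list(range(0xAE, 0x100)))  # U+00AE .. U+00FF
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Remaining bytes are shifted to code points 256, 257, ...
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}
# Assumption: the page renders the space marker (U+0120) as a plain space.
CHAR_TO_BYTE.setdefault(" ", 0x20)


def token_to_bytes(token):
    """Map each displayed character of a token back to its raw byte."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


def describe(token):
    """Return the decoded text, or the hex bytes if the UTF-8 is incomplete."""
    raw = token_to_bytes(token)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return "partial UTF-8: " + raw.hex()


if __name__ == "__main__":
    # Tokens written with escapes; comments give the page's display form.
    samples = [
        "\u00d7\u0137",  # shown on the page as the 1.10 token
        "\u00d7\u013b",  # shown on the page as the 1.07 token
        "\u00d7",        # shown on the page as the 1.13 token
        "\u00e0\u00a4",  # shown on the page as the 1.38 token
    ]
    for tok in samples:
        print(token_to_bytes(tok).hex(), "->", describe(tok))
```

Tokens that decode only to a "partial UTF-8" prefix are the leading bytes of a longer character: `d7` begins many Hebrew letters, and `e0 a4` begins characters in the Devanagari block. If the underlying model uses a different tokenizer, this mapping does not apply.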
    Activation Density: 0.001%

    No Known Activations