INDEX
    Explanations

    names of specific entities or titles, indicating a focus on notable people, places, or concepts

    Negative Logits
     صوتيه              -1.05
     Waray              -1.03
     виправивши         -0.99
    ValueStyle          -0.94
     المعيارى           -0.85
     utafitiHapana      -0.85
    Демографія          -0.83
    DockStyle           -0.82
    BASEPATH            -0.81
    GenerationType      -0.80
    Positive Logits
                        0.67
    ↵↵                  0.65
    <eos>               0.59
    ↵↵↵                 0.53
    <code>              0.51
    ).                  0.50
     irony              0.49
                        0.48
     beets              0.48
    <em>                0.47
    Act Density 0.403%

    No Known Activations