INDEX
    Explanations

    occurrences of the word "This" or related phrases indicating emphasis on specific points or details

    Negative Logits
    idis        -0.17
    stan        -0.16
     Dare       -0.15
    ille        -0.15
    illes       -0.15
    IPH         -0.14
    esta        -0.14
    annes       -0.14
    arily       -0.14
    aved        -0.14
    Positive Logits
    _PADDING     0.16
    chio         0.15
    oard         0.14
    andelier     0.14
    ayet         0.14
    /stretch     0.14
    rack         0.14
    _squared     0.14
    )((((        0.14
    -prepend     0.13
    Activation Density: 0.117%

    No Known Activations