INDEX
    Explanations

    references to figures or numeric values

    NEGATIVE LOGITS
    Token             Logit
    "rich"            -0.17
    "lok"             -0.16
    "ylon"            -0.16
    "rien"            -0.16
    "uga"             -0.15
    "wich"            -0.14
    " Moody"          -0.14
    "mouth"           -0.14
    "rne"             -0.14
    "eties"           -0.14

    POSITIVE LOGITS
    Token             Logit
    "head"            0.19
    ".fig"            0.19
    "tte"             0.17
    "heads"           0.17
    "inth"            0.16
    "RED"             0.15
    "headed"          0.15
    "uration"         0.15
    "oft"             0.15
    " prominently"    0.14
    Activation Density: 0.038%

    No Known Activations