INDEX
    Explanations

    various forms and derivatives of the word "fire."

    Negative Logits
    Token       Logit
    ĸļ          -0.79
    sylv        -0.77
    iple        -0.73
    onent       -0.72
    ociated     -0.69
    imore       -0.69
    insula      -0.67
    ermott      -0.66
    iances      -0.66
    acca        -0.66
    Positive Logits
    Token       Logit
    lli         1.35
    nces        1.21
    tta         1.17
    tto         1.13
    lda         1.05
    nder        1.04
    lla         1.02
    llo         0.99
    nce         0.98
    ttes        0.96
    Activation Density: 0.012%

    No Known Activations