INDEX
    Explanations

    phrases related to the concept of 'origin' or beginnings

    Negative Logits
    ington      -0.18
    iro         -0.15
    ew          -0.14
    alar        -0.14
    °           -0.14
    erence      -0.14
    ,           -0.14
     if         -0.13
     at         -0.13
    erman       -0.13

    Positive Logits
    ONGL        0.17
    forge       0.15
    YNC         0.15
    entially    0.15
    /source     0.15
    arily       0.15
    dden        0.15
    obuf        0.15
    ummings     0.15
    ãĥĭãĤ¢      0.15
    Activation Density: 0.028%

    No Known Activations