INDEX
    Explanations

    specific names or terms that refer to people or characters

    Negative Logits
    hurst    -0.16
    &&&&     -0.15
    izard    -0.15
    afen     -0.14
    addy     -0.14
    hof      -0.14
    ifton    -0.14
    eddar    -0.14
    hevik    -0.14
    lint     -0.14
    Positive Logits
    ucc          0.17
    REW          0.16
     Richards    0.15
     tart        0.15
     val         0.15
     diff        0.14
     fog         0.14
     mond        0.14
     Dag         0.14
    essel        0.14
    Act Density 0.003%

    No Known Activations