INDEX
    Explanations
    Negative Logits
    Token           Logit
    " Cluster"      -0.07
    " marriage"     -0.07
    "Death"         -0.06
    " террит"       -0.06
    " __("          -0.06
    " obsession"    -0.06
    "Ub"            -0.06
    "atism"         -0.06
    " kt"           -0.06
    " obj"          -0.06
    Positive Logits
    Token           Logit
    " fine"         0.16
    " Fine"         0.15
    "Fine"          0.12
    "fine"          0.12
    " finest"       0.11
    " finer"        0.10
    "FINE"          0.10
    " fined"        0.09
    "INE"           0.08
    "ine"           0.08
    Activation Density: 0.013%

    No Known Activations