INDEX
    Negative Logits
    "_two"           -0.07
    "coon"           -0.06
    "_nom"           -0.06
    " chew"          -0.06
    " succeeded"     -0.06
    "?)↵"            -0.06
    " haunting"      -0.06
    "_three"         -0.06
    " Exhibition"    -0.06
    "_car"           -0.06

    Positive Logits
    "Outside"        0.08
    " Outside"       0.07
    "ilog"           0.06
    "wayne"          0.06
                     0.06
    "Republic"       0.06
    " Republic"      0.06
    "лося"           0.06
    " Beat"          0.06
    "yard"           0.06

    Activation Density: 0.004%

    No Known Activations