INDEX
    Explanations
    Negative Logits
    Token        Logit
    "pled"       -0.08
    "thern"      -0.08
    " Liam"      -0.08
    " müm"       -0.07
    " Tomb"      -0.07
    "ische"      -0.07
    "ischer"     -0.07
    "Jennifer"   -0.07
    " Pom"       -0.07
    "len"        -0.07
    Positive Logits
    Token        Logit
    " ACT"       0.13
    " act"       0.13
    "Act"        0.12
    " Act"       0.12
    "act"        0.12
    "ACT"        0.10
    " acts"      0.10
    ".act"       0.09
    "Acts"       0.09
    "_Act"       0.08
    Activation Density: 0.013%

    No Known Activations