    NEGATIVE LOGITS
    ви              -0.09
    .portal         -0.09
    tej             -0.09
     æ¶             -0.08
    agan            -0.08
     sight          -0.08
    poster          -0.08
     fug            -0.08
    625             -0.08
    uggy            -0.08
    POSITIVE LOGITS
    ses             0.15
     lid            0.15
    (ir             0.14
     ball           0.13
     momentum       0.13
     conversation   0.12
     tempo          0.12
     initiative     0.12
     pace           0.12
     brakes         0.11
    Activation Density: 0.097%

    No Known Activations