    Negative Logits
    pet             -0.07
    .validator      -0.06
     horn           -0.06
    2               -0.06
     nick           -0.06
     Departments    -0.06
     prose          -0.06
     Point          -0.06
     Tor            -0.06
    öyle            -0.06

    Positive Logits
    .Now            0.07
    Wa              0.07
    esting          0.06
    ...");↵↵        0.06
    .small          0.06
    _aligned        0.06
    ='<?            0.06
    βολή            0.06
    dad             0.06
    TRANS           0.06
    Activation Density 0.018%

    No Known Activations