INDEX
    NEGATIVE LOGITS
    =".
    -0.42
    __::
    -0.41
    (::
    -0.40
    ThroughAttribute
    -0.39
     CCS
    -0.39
    -0.39
    __":
    -0.37
    |')
    -0.37
    .');
    -0.37
    °.
    -0.36
    POSITIVE LOGITS
     Trump
    2.44
    Trump
    2.28
    trump
    1.73
     trump
    1.66
    特朗普
    1.38
     ترام
    1.26
     Donald
    1.05
     Trumpet
    1.01
     Biden
    0.98
     trumpet
    0.97
    Activation Density 0.004%

    No Known Activations