    Explanations

    journey, activity, time, crypto, money

    Negative Logits
    " "             0.66
    " University"   0.55
    " H"            0.54
    " trumpet"      0.53
    " the"          0.52
    " års"          0.52
    " chocolate"    0.50
    " venerable"    0.50
    " L"            0.50
    " I"            0.50
    Positive Logits
    (token missing) 0.67
    "并且"           0.61
    (token missing) 0.55
    (token missing) 0.54
    "whereas"       0.53
    " voordat"      0.53
    (token missing) 0.53
    " aswell"       0.52
    "而且"           0.50
    "由于"           0.50
    Act Density 0.001%

    No Known Activations