INDEX
    Explanations
    NEGATIVE LOGITS
    Token               Logit
    " smart"            -1.84
    " Smart"            -1.75
    "Smart"             -1.74
    "smart"             -1.58
    "SMART"             -1.34
    " SMART"            -1.29
    " smartest"         -1.10
    "スマート"          -1.02
    " smarter"          -0.98
    " inteligentes"     -0.93
    POSITIVE LOGITS
    Token               Logit
    "ness"              0.75
    "watches"           0.63
    "nesses"            0.61
    "eners"             0.61
    "est"               0.60
    "ings"              0.59
    "ments"             0.58
    "Rujuakan"          0.56
    "enment"            0.56
    "shields"           0.54
    Activation Density: 0.085%

    No Known Activations