    Explanations

    expressions related to interest or curiosity

    Negative Logits
    Token               Logit
    " Caw"              -0.57
    " utafitiHapana"    -0.50
    " Goy"              -0.50
    "Rüyada"            -0.49
    "RetentionPolicy"   -0.49
    " giù"              -0.48
    " glyph"            -0.47
    " jaws"             -0.46
    " Lawton"           -0.46
    "tasche"            -0.46
    Positive Logits
    Token               Logit
    " Interest"         1.04
    "Interest"          0.94
    "interest"          0.91
    " interest"         0.89
    " Interests"        0.84
    "INTEREST"          0.77
    "interested"        0.77
    " INTEREST"         0.76
    " interests"        0.73
    " Interested"       0.73
    Activation Density: 0.105%

    No Known Activations