INDEX
    Explanations

    phrases that indicate a specific action or sentiment, often related to personal experiences or opinions

    Negative Logits
    Token           Logit
    " Glou"         -0.68
    "anwhile"       -0.68
    " Ambro"        -0.67
    " Skydragon"    -0.65
    " Thomson"      -0.65
    " Sleeping"     -0.64
    " Bened"        -0.63
    " Ket"          -0.62
    " incent"       -0.62
    " Simpl"        -0.61
    Positive Logits
    Token           Logit
    "¬"             1.29
    "Ĵ"             1.15
    "ħ"             1.11
    "ĸ"             1.10
    "ı"             1.10
    "į"             1.08
    "Ļ"             1.06
    "Ķ"             1.04
    "¯"             1.03
    "ĩ"             1.02
    Act Density 0.223%

    No Known Activations