INDEX
    Explanations

    instances of the word "ha," indicating laughter or amusement

    Negative Logits
    SpringRunner
    -0.53
    tidumbre
    -0.51
    wpi
    -0.50
    																					
    -0.49
     Dunkel
    -0.49
    mtr
    -0.48
     oner
    -0.47
     שלו
    -0.46
    gerald
    -0.46
    pozorn
    -0.46
    Positive Logits
    ha
    2.31
    HA
    1.49
    Ha
    1.44
     ha
    1.43
     Ha
    1.30
     HA
    1.03
    ха
    0.91
    0.88
    हा
    0.87
    haa
    0.82
    Act Density 0.008%

    No Known Activations