Explanations
instances of the word "haha" at varying intensities
expressions of laughter or amusement
Negative Logits
Token        Logit
lining       -0.86
lines        -0.82
master       -0.77
lest         -0.73
directions   -0.73
ership       -0.69
direction    -0.68
picking      -0.67
craft        -0.66
writing      -0.66
Positive Logits
Token        Logit
ahah         1.12
aha          1.07
awk          0.97
ibaba        0.97
awks         0.90
HAHA         0.89
uthor        0.88
qua          0.88
ghan         0.87
UGH          0.85
Activations Density: 0.014%