INDEX
Explanations
emotions and expressions related to societal issues and personal reflections
The token "lol" or similar internet slang
expressions of amusement
Negative Logits
ſind       -0.90
iſt        -0.86
་་         -0.85
'\\;'      -0.77
ſi         -0.74
ſever      -0.74
quæ        -0.73
ſeveral    -0.73
myſelf     -0.72
―――――      -0.72
Positive Logits
lmao        0.78
ironically  0.77
<eos>       0.76
↵↵          0.71
lol         0.68
fuck        0.68
meme        0.67
rekt        0.67
Lmao        0.66
kek         0.66
Activations Density 0.259%