INDEX
Explanations
words related to intense emotions or actions, such as 'detector', 'outrage', 'genius', 'inexplicable', 'love', 'trans', 'giggles', 'sly', 'bored', 'quick-trigger'
expressions related to humor and comedic elements
Negative Logits
Token       Logit
accept      -0.62
conduc      -0.62
Sah         -0.61
ourses      -0.61
PTS         -0.61
offshore    -0.61
leasing     -0.60
challeng    -0.59
arsen       -0.59
âĵĺ         -0.58
Positive Logits
Token           Logit
pants           1.07
chuckle         1.05
understatement  1.02
sarc            1.01
irony           0.98
grin            0.97
nerd            0.97
laughter        0.95
jokes           0.94
memes           0.94
Activations Density 0.774%
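
The density figure can be read as the share of sampled token positions on which the feature fires. The sketch below illustrates that reading only; the definition (activation strictly above a threshold of zero), the function name `activation_density`, and the sample values are assumptions for illustration, not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions with a nonzero
    (here: strictly greater than threshold) activation.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1,000 token positions, 8 of which fire.
sample = [0.0] * 992 + [0.4, 1.2, 0.1, 2.5, 0.7, 0.3, 1.9, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.800%
```

Under this reading, a density of 0.774% would mean the feature is active on roughly 8 of every 1,000 sampled tokens.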