INDEX
Explanations
phrases related to casual conversation and social interactions
conversational expressions and interjections
Negative Logits
sequently     -0.66
ourses        -0.66
Q             -0.65
ourse         -0.64
ilst          -0.64
20439         -0.64
avering       -0.63
etermined     -0.63
egu           -0.62
sufficient    -0.62
Positive Logits
freaking      0.96
fucking       0.94
goddamn       0.94
crappy        0.94
damn          0.92
shitty        0.92
nerds         0.92
kidding       0.92
dudes         0.90
crap          0.89
Activations Density: 1.502%