INDEX
Explanations
mentions of complaints or complicated issues
instances of the word "complaint" and its variations
Negative Logits
EStream      -0.79
selection    -0.78
bill         -0.76
ctors        -0.75
»Ĵ           -0.74
doors        -0.73
plane        -0.70
ggies        -0.70
hyde         -0.69
garlic       -0.68
Positive Logits
Compl        1.14
Compl        0.90
icit         0.86
aints        0.83
aint         0.81
ustration    0.79
ications     0.79
icated       0.78
acent        0.77
Puzz         0.77
Activations Density 0.009%