INDEX
Explanations
phrases related to expressing personal opinions or perspectives
Negative Logits
themselves    -0.69
Hebdo         -0.64
dissip        -0.64
allegedly     -0.62
dissolved     -0.61
purportedly   -0.60
itself        -0.60
ousted        -0.60
Us            -0.59
emer          -0.59
Positive Logits
myself        1.68
personally    1.04
poke          0.98
my            0.94
glad          0.89
regret        0.82
thankful      0.80
fortunate     0.79
recommend     0.78
wondering     0.76
Activations Density 6.077%