INDEX
Explanations
references to the concept of "Thinking" or reflecting
prompts that encourage reflection or consideration
Negative Logits
cffffcc        -0.71
iqueness       -0.71
Adin           -0.69
taboola        -0.67
GMT            -0.66
conservancy    -0.64
Interstitial   -0.64
recorded       -0.62
CBC            -0.62
Merit          -0.62
Positive Logits
ileaks         0.77
ative          0.72
notations      0.71
iste           0.70
lahoma         0.70
pad            0.70
onymous        0.69
ventus         0.69
eport          0.68
itia           0.67
Activations Density: 0.033%