Explanations
phrases related to awareness and consciousness regarding facts or information
Negative Logits
Mu              -0.66
Biggs           -0.66
IndentedString  -0.65
yong            -0.65
://             -0.63
din             -0.60
Schultz         -0.60
walde           -0.59
Mu              -0.59
QUADS           -0.59
Positive Logits
aware           2.21
Aware           2.09
awareness       2.05
awareness       1.93
aware           1.93
Awareness       1.86
Awareness       1.82
Aware           1.58
unaware         1.45
consciente      1.34
Activations Density 0.062%