INDEX
Explanations
concepts related to self-awareness and interconnectedness
Negative Logits
337       -0.17
Zucker    -0.15
iew       -0.15
egg       -0.14
igen      -0.14
Heller    -0.14
834       -0.14
igs       -0.14
inde      -0.14
336       -0.14
Positive Logits
Krish           0.21
Conditioning    0.17
ayo             0.17
Brock           0.15
flowering       0.15
Ved             0.15
飯              0.15
abra            0.15
Educational     0.15
sir             0.14
Activations Density 0.006%