INDEX
Explanations
important concepts related to critiques of societal norms, particularly focusing on hypocrisy and the expectations placed on individuals
Negative Logits
cave          -0.16
miscon        -0.15
aspirations   -0.15
premises      -0.15
yles          -0.15
answers       -0.14
ives          -0.14
motives       -0.14
merits        -0.14
ptions        -0.14
Positive Logits
item          0.36
feature       0.34
attribute     0.31
detail        0.30
thing         0.30
element       0.29
piece         0.28
aspect        0.28
issue         0.28
feature       0.27
Activations Density 1.060%