Explanations
keywords related to generalizations or inclusive statements
references to "everyone."
Negative Logits
anga                -0.68
QL                  -0.64
vern                -0.61
pose                -0.60
bal                 -0.58
testament           -0.57
guiActiveUnfocused  -0.57
eeds                -0.56
eder                -0.56
duc                 -0.56
Positive Logits
else                 1.45
THING                1.03
Else                 0.99
imaginable           0.93
else                 0.92
Else                 0.82
conceivable          0.81
ãĤ§                  0.77
who                  0.76
igator               0.76
Activations Density 0.040%