INDEX
Explanations
names of individuals in legal contexts
punctuation, specifically commas
Negative Logits
stereotypes     -0.69
FIX             -0.69
tremend         -0.69
dreams          -0.67
predictions     -0.65
incon           -0.64
Availability    -0.64
prompt          -0.64
animals         -0.63
deficit         -0.63
Positive Logits
Jr              1.18
Sr              1.09
tein            1.01
Jr              0.99
aka             0.95
LLP             0.92
QC              0.80
Calif           0.78
Es              0.76
supra           0.75
Activations Density: 0.678%