Explanations
phrases indicating moral or ethical teachings
Negative Logits
Token        Logit
deen         -0.15
TintColor    -0.15
eft          -0.15
rowsable     -0.14
йн           -0.14
Norm         -0.14
amy          -0.14
Norm         -0.14
AIT          -0.14
assignable   -0.13
Positive Logits
Token        Logit
Wisdom       0.15
539          0.15
Pro          0.15
Richt        0.15
交           0.15
Bias         0.14
Prov         0.14
CPR          0.14
wisdom       0.14
affles       0.14
Activations Density: 0.096%