INDEX
Explanations
information related to safety procedures or guidelines
Negative Logits
(token missing)  -0.75
Bashar           -0.75
Maduro           -0.75
Confederacy      -0.73
mutants          -0.70
presided         -0.70
Murdoch          -0.70
Plaint           -0.69
iannopoulos      -0.69
riots            -0.69
Positive Logits
beginner   1.25
beginners  1.14
Your       1.06
your       1.04
Tips       1.02
Helpful    1.02
Learn      1.02
Tips       1.00
Yourself   0.98
your       0.97
Activations Density 3.554%