INDEX
Explanations
text related to providing guidance or explanations
phrases that indicate guidance or instructional content
Negative Logits
Virgin        -0.72
ussia         -0.70
Arabs         -0.69
ラン          -0.68
Jews          -0.67
Islamic       -0.66
Muslims       -0.65
Nazis         -0.64
ansas         -0.63
disapprove    -0.62
Positive Logits
infographic    0.90
Fortunately    0.87
Luckily        0.83
Luckily        0.83
ado            0.78
Fortunately    0.77
Thankfully     0.76
luckily        0.76
Thankfully     0.75
fortunately    0.74
Activations Density 0.555%
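
A density figure like the one above is commonly read as the fraction of sampled token positions on which the feature fires. The sketch below illustrates that reading only; the page does not state how its value is computed. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero activation.
    The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 5 of which activate the feature.
sample = [0.0] * 995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the dashboard figure
```

Under this reading, a density of 0.555% would mean the feature fires on roughly one token in every 180.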