INDEX
Negative Logits
explanations    1.42
explanation     1.36
Explanation     1.35
Explanation     1.34
explanation     1.33
explicação      1.31
explicación     1.22
Explained       1.13
explaining      1.10
Explain         1.03
Positive Logits
tell            1.09
guys            0.88
indulge         0.87
mind            0.86
remind          0.83
cheeky          0.81
selfish         0.81
maybe           0.80
giggle          0.79
Tell            0.76
Activations Density: 0.065%