INDEX
Explanations
quotations or reported speech
reported speech or statements made by individuals
Negative Logits
Klux          -0.75
Republican    -0.71
DonaldTrump   -0.70
Attorney      -0.65
mathemat      -0.64
Nor           -0.62
ocide         -0.62
barred        -0.62
impe          -0.61
punish        -0.61
Positive Logits
:"        1.20
"[        0.89
:[        0.85
goodbye   0.84
"…        0.83
:         0.80
"...      0.80
""        0.79
"(        0.78
hello     0.74
Activations Density 0.200%