INDEX
Explanations
phrases consisting of a word followed by " he said."
statements and quotations from speakers
Negative Logits
ノ           -0.79
shitty      -0.61
animate     -0.60
ーティ      -0.60
Birthday    -0.59
crappy      -0.59
magically   -0.58
Kardash     -0.57
EVERY       -0.56
steroids    -0.56
Positive Logits
spokeswoman   0.78
ulty          0.77
spokesman     0.75
20439         0.74
Reuters       0.70
cited         0.69
iannopoulos   0.68
Cheong        0.67
scathing      0.67
IDER          0.66
Activations Density: 0.424%