INDEX
Explanations
phrases related to quotes or statements made by various individuals or entities
mentions of news organizations or press sources
Negative Logits
diaper          -0.60
veter           -0.56
atible          -0.54
doesnt          -0.54
viz             -0.53
favourites      -0.52
animate         -0.51
precon          -0.51
harms           -0.50
unfocusedRange  -0.49
Positive Logits
.               0.93
.</             0.81
."              0.77
.).             0.75
sarcast         0.73
."              0.70
quoted          0.70
lied            0.69
.]              0.69
rhet            0.69
Activations Density 0.196%