INDEX
Explanations
phrases related to spoken or written remarks made by individuals
statements and comments about public figures or controversial topics
Negative Logits
ccording     -0.70
tis          -0.66
keys         -0.66
bid          -0.65
Projects     -0.64
VID          -0.63
vu           -0.62
JV           -0.61
locked       -0.59
PG           -0.59
Positive Logits
about        1.18
uttered      1.01
dispar       1.00
implying     0.99
aloud        0.98
regarding    0.98
ABOUT        0.89
pertaining   0.88
praising     0.87
referring    0.87
Activations Density 0.089%