INDEX
Explanations
words related to current events and political controversies
Negative Logits
ties          -0.68
makeshift     -0.63
retri         -0.63
Ͻ             -0.61
perspect      -0.61
mate          -0.59
isolation     -0.59
graded        -0.58
contingent    -0.57
hindsight     -0.57
Positive Logits
Continue        0.99
Advertisement   0.98
advertisement   0.83
Skip            0.77
Advertisement   0.73
advertising     0.73
ieu             0.72
usercontent     0.70
ADVERTISEMENT   0.70
sburg           0.69
Activations Density 0.281%