INDEX
Explanations
words related to expressing opinions or views
Negative Logits
Token       Logit
consecut    -0.67
eri         -0.67
ammy        -0.66
trap        -0.66
vous        -0.65
amn         -0.65
esses       -0.64
ilant       -0.63
eff         -0.63
Explos      -0.62
Positive Logits
Token       Logit
views       1.03
opinions    0.95
viewpoints  0.88
finder      0.87
opin        0.85
opinion     0.84
topic       0.82
viewpoint   0.82
»           0.82
expressed   0.81
Activations Density: 0.890%