Explanations
names of people from news sources or interviews
mentions of a specific person in a context related to news or interviews
Negative Logits
tremend     -0.95
satell      -0.81
eatures     -0.79
empt        -0.75
            -0.72
wcs         -0.71
challeng    -0.71
cumbers     -0.71
thirst      -0.70
stagn       -0.70
Positive Logits
Yeah        1.43
Exactly     1.39
Explain     1.35
Well        1.28
Alright     1.24
Absolutely  1.20
Okay        1.18
Right       1.17
Sure        1.13
Correct     1.12
Activations Density 0.037%
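
To put the density figure in concrete terms, the sketch below converts the reported percentage into an expected count of active tokens. It assumes that "density" means the fraction of sampled tokens on which the feature has a nonzero activation; the page does not define the term, and the sample size and function names are hypothetical, chosen only for illustration.

```python
# Assumption: "density" is the fraction of sampled tokens on which the
# feature has a nonzero activation. The page does not state this.
DENSITY_PERCENT = 0.037  # value reported above


def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires."""
    return density_percent / 100.0 * n_tokens


def tokens_per_activation(density_percent: float) -> float:
    """Average number of tokens between activations."""
    return 100.0 / density_percent


if __name__ == "__main__":
    # Hypothetical sample of one million tokens.
    print(expected_active_tokens(DENSITY_PERCENT, 1_000_000))
    print(tokens_per_activation(DENSITY_PERCENT))
```

Under that assumption, the feature fires on roughly one token in 2,700.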