INDEX
Explanations
phrases related to controversial or sensational news stories
sentences or statements that discuss significant events or issues
Negative Logits
defe         -0.86
bud          -0.74
bounded      -0.72
affili       -0.69
appra        -0.67
medium       -0.66
clin         -0.65
dep          -0.65
uncom        -0.65
rebirth      -0.64
Positive Logits
Similarly       1.19
Needless        1.14
Likewise        1.12
Apparently      1.09
Specifically    1.07
Previously      1.05
Interestingly   1.05
Similar         1.04
Whereas         1.04
Ironically      1.03
Activations Density 0.757%