INDEX
Explanations
news headlines indicating urgency or importance, often accompanied by a command to watch or read
urgent or emphatic directives
Negative Logits
76561      -0.80
Hamp       -0.71
Vish       -0.71
Bhar       -0.69
gee        -0.68
bats       -0.66
itar       -0.66
gdala      -0.65
Catalyst   -0.65
Chak       -0.64
Positive Logits
WATCH      0.88
ered       0.82
ICLE       0.82
LECT       0.79
ARD        0.78
ering      0.77
nyder      0.77
comply     0.77
obey       0.74
ORTS       0.73
Activations Density: 0.012%