INDEX
Explanations
mentions of media organizations or news events
references to popular television shows or significant cultural events
NEGATIVE LOGITS
THEN          -0.55
Intermediate  -0.55
"},{"         -0.54
ļéĨĴ          -0.52
then          -0.51
lication      -0.49
laun          -0.49
atta          -0.49
apo           -0.49
OTAL          -0.48
POSITIVE LOGITS
lately        1.35
since         0.92
steadily      0.87
since         0.82
recent        0.80
Recent        0.69
consistently  0.68
recently      0.67
pmwiki        0.66
countless     0.65
Activations Density 1.081%