Explanations
references to magazines and their sections
magazine publications
Negative Logits
Token      Logit
4          -0.48
helping    -0.47
Hull       -0.47
Biden      -0.45
ERTY       -0.45
sleepy     -0.45
this       -0.45
Biden      -0.43
sorry      -0.42
Protect    -0.42
Positive Logits
Token      Logit
magazine   1.24
magazines  1.20
Magazine   1.12
Magazines  1.09
magazine   1.08
MAGAZINE   1.02
Magazine   1.02
MAGAZINE   0.87
magaz      0.86
Magaz      0.86
Activations Density 0.005%