Explanations
references to the Guardian newspaper
Negative Logits
Token        Logit
cles         -0.87
merce        -0.85
itionally    -0.84
perm         -0.83
lease        -0.79
ĸļ           -0.77
perature     -0.77
rano         -0.73
zinski       -0.72
icka         -0.72
Positive Logits
Token        Logit
Angels       1.06
Angel        0.91
ãĥķãĤ¡       0.86
Editorial    0.83
Islands      0.81
Sentinel     0.80
Observer     0.80
Agency       0.79
Newspaper    0.74
Newsp        0.74
Activation density: 0.019%