INDEX
Explanations
proper nouns or named entities related to news stories or events
Negative Logits
ply        -0.67
anship     -0.62
length     -0.61
lengths    -0.60
Fine       -0.60
uning      -0.58
ERAL       -0.58
ithering   -0.58
sheer      -0.58
ade        -0.57
Positive Logits
closest              0.86
liest                0.85
iest                 0.80
nearest              0.78
hardest              0.74
dstg                 0.73
tallest              0.70
holiest              0.69
channelAvailability  0.68
strongest            0.67
Activations Density: 3.100%