INDEX
Explanations
mentions of new or recently introduced things
references to new or recently introduced items or concepts
Negative Logits
peanuts     -0.73
cius        -0.67
BILITY      -0.67
advised     -0.66
depress     -0.63
beh         -0.62
gru         -0.61
microw      -0.61
chnology    -0.60
Zip         -0.60

Positive Logits
foundland   1.23
bies        1.21
theless     0.96
new         0.95
bie         0.92
etheless    0.89
tons        0.88
fw          0.87
YORK        0.86
lisher      0.83
Activations Density: 0.009%