INDEX
Explanations
instances of the word "Chart", with a preference for higher activations
occurrences of the word "chart" and its variations
Negative Logits
Romeo       -0.68
ignt        -0.66
vae         -0.64
IRE         -0.61
Paulo       -0.60
Cic         -0.60
Chinatown   -0.59
LIMITED     -0.59
cknow       -0.58
Goodman     -0.57
Positive Logits
ered        1.32
ued         1.06
ing         0.90
erer        0.90
uing        0.86
ed          0.86
erers       0.84
chart       0.82
icle        0.80
eer         0.80
Activations Density 0.033%
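
The page does not define the density figure. A common reading is the fraction of sampled tokens on which the feature activates at all, which would put 0.033% at roughly one token in 3,000. The sketch below illustrates that reading only: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires; the dashboard itself does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 3 of which activate the feature.
sample = [0.0] * 9997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Raising `threshold` counts only the stronger activations, which is one way to compare how often a feature fires weakly against how often it fires strongly.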