Explanations
proper nouns and names
references to specific individuals and their affiliations or attributions
Negative Logits
Token       Logit
Hare        -0.67
uyomi       -0.66
Yak         -0.62
vik         -0.61
Disciple    -0.60
cav         -0.59
gravy       -0.58
Rivals      -0.58
fruitful    -0.58
Dino        -0.57

Positive Logits
Token       Logit
]);         0.93
});         0.84
NPR         0.81
});         0.77
)</         0.75
Photography 0.73
));         0.73
)."         0.72
));         0.68
%).         0.67
Activations Density 0.106%