INDEX
Explanations
references to aging or things labeled as "old."
"old" before other words
old concepts and items
Negative Logits
possano         -0.67
hilsen          -0.65
agoza           -0.64
drawable        -0.63
Praze           -0.62
Himo            -0.62
ViewFeatures    -0.62
esgue           -0.61
췄              -0.61
butterflies     -0.60
Positive Logits
fashioned       1.23
OLD             1.06
old             1.04
Old             1.01
Old             0.95
vieux           0.87
fashioned       0.85
vieja           0.81
vecchio         0.81
viejo           0.80
Activations Density 0.064%
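
The density figure above is conventionally read as the fraction of token positions on which the feature is active at all. The page itself does not define it, so the sketch below rests on that assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and serve only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 6 of them with a nonzero activation.
sample = [0.0] * 9994 + [1.5, 0.3, 2.1, 0.8, 4.0, 0.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.064% would correspond to the feature firing on roughly 6 or 7 of every 10,000 tokens.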