INDEX
Explanations
numerical representations or mentions of age
Negative Logits
Token       Logit
es          -0.87
Orrell      -0.83
IBinder     -0.74
</strong>   -0.69
/           -0.66
)           -0.64
er          -0.63
car         -0.62
def         -0.59
“           -0.59
Positive Logits
Token       Logit
eighty      2.10
seventy     2.08
sixty       2.04
fifty       2.00
ninety      2.00
forty       1.98
thirty      1.97
twenty      1.95
twenty      1.80
forty       1.76
Activations Density: 0.164%