INDEX
Explanations
proper names or entities preceded by 'of'
the phrase "of" followed by various numerical values or counts
Negative Logits
agre         -0.83
ahime        -0.73
condem       -0.66
arrang       -0.64
misunder     -0.63
icultural    -0.62
rina         -0.62
surpr        -0.61
fert         -0.61
hend         -0.61
Positive Logits
teenth       0.70
ij士          0.69
dozen        0.68
teen         0.66
icial        0.65
finalists    0.63
Fulton       0.63
ources       0.63
UCLA         0.62
ife          0.59
Activations Density 0.072%