INDEX
Explanations
references to quantities or numerical values
phrases that describe quantities or relationships using "of" in various contexts
Negative Logits
agre          -0.75
encour        -0.69
icultural     -0.63
streng        -0.63
esting        -0.63
confir        -0.62
misunder      -0.62
sensit        -0.60
performance   -0.59
disapp        -0.58
Positive Logits
teenth        0.75
teen          0.73
ij士           0.69
those         0.66
eely          0.64
icial         0.63
ife           0.63
them          0.63
ridor         0.62
ARE           0.62
Activations Density: 0.068%