INDEX
Explanations
phrases related to numerical quantities and distinct entities
references to quantifiable entities or categories
Negative Logits
brance          -0.81
heit            -0.78
matter          -0.72
govtrack        -0.71
culosis         -0.69
etermination    -0.67
abad            -0.66
003             -0.64
otto            -0.64
life            -0.64
Positive Logits
aforementioned  0.95
pillars         0.91
platforms       0.83
ses             0.81
avenues         0.81
eras            0.81
bedrooms        0.80
earliest        0.79
extremes        0.79
occasions       0.79
Activations Density: 0.126%