INDEX
Explanations
references to specific numbers and measurements
numerical data with a focus on statistics and measurements
NEGATIVE LOGITS
estine       -0.81
hood         -0.74
erate        -0.72
uscript      -0.69
pton         -0.69
friend       -0.65
ened         -0.64
istically    -0.64
ying         -0.64
ledged       -0.63
POSITIVE LOGITS
ILCS         1.01
mph          0.88
entrants     0.80
lbs          0.76
chars        0.74
ANT          0.73
MPH          0.72
lbs          0.72
mph          0.71
ants         0.70
Activations Density: 0.082%