Explanations
the word "vast."
references to large quantities or mass populations
Negative Logits
Token       Logit
headers     -0.74
WAR         -0.69
Dialogue    -0.69
iffe        -0.66
MI          -0.65
apple       -0.64
PT          -0.64
verbs       -0.64
clicked     -0.64
girls       -0.63
Positive Logits
Token       Logit
majority    0.98
amounts     0.97
swat        0.93
quantities  0.90
swath       0.86
expans      0.84
ness        0.84
sums        0.78
itud        0.77
gulf        0.77
Activations Density: 0.018%