Explanations
proper nouns
instances of the verb "was"
Negative Logits
Extend     -0.69
Make       -0.68
IMAGES     -0.67
izable     -0.66
entails    -0.66
HAVE       -0.66
Which      -0.65
Compare    -0.65
Kids       -0.64
holders    -0.64
Positive Logits
able           1.20
born           1.19
unable         1.05
originally     1.01
supposed       0.99
hes            0.99
tasked         0.97
instrumental   0.96
arrested       0.96
sentenced      0.95
Activations Density: 0.348%