INDEX
Explanations
proper nouns related to political figures named Collins
occurrences of the name "Collins."
Negative Logits
ICAN        -0.71
discrep     -0.70
ilitation   -0.70
itia        -0.65
ETA         -0.65
iddled      -0.63
inished     -0.63
OOD         -0.61
istic       -0.60
hoc         -0.60
Positive Logits
worth       1.26
mount       0.86
ville       0.82
ively       0.82
Collins     0.82
es          0.80
mann        0.80
leans       0.79
creen       0.76
boro        0.75
Activations Density: 0.033%