Explanations
names of public figures or political events
Negative Logits
Token          Logit
blender        -0.70
convict        -0.66
scrim          -0.66
bloom          -0.65
elim           -0.65
subdivision    -0.65
Assy           -0.63
Belfast        -0.62
Buenos         -0.62
Bulgarian      -0.62
Positive Logits
Token          Logit
own            1.02
tsy            0.95
ï¸ı            0.94
tre            0.91
self           0.91
shall          0.87
iversary       0.85
s              0.85
ship           0.84
daughter       0.84
Activations Density 0.190%