INDEX
Explanations
references to famous individuals and historical events
Negative Logits
Pearce     -0.62
strangers  -0.57
erect      -0.57
imaginary  -0.53
Nile       -0.52
breeds     -0.52
IMAGES     -0.52
Rumble     -0.50
Mirage     -0.50
Mongo      -0.49
POSITIVE LOGITS
atural  0.79
otom    0.74
cream   0.72
ules    0.72
kowski  0.71
pool    0.70
ette    0.70
creen   0.70
ellen   0.69
lay     0.69
Activations Density: 6.848%