INDEX
Explanations
various instances of the word "ideas"
references to innovative or philosophical concepts
Negative Logits
ded         -0.71
Delivery    -0.70
Military    -0.67
Mamm        -0.66
ORD         -0.65
occupancy   -0.64
ALL         -0.63
sole        -0.61
enary       -0.60
effect      -0.60
Positive Logits
ensical     0.92
ideas       0.84
matter      0.83
ĸļ          0.81
uggest      0.81
mith        0.78
omething    0.77
Ideas       0.75
cape        0.74
theories    0.74
Activations Density 0.064%