INDEX
Explanations
proper nouns or trademarks
references to the term "original" in various contexts
Negative Logits
wal       -0.84
rosso     -0.72
ega       -0.71
Adv       -0.70
gerald    -0.70
rom       -0.68
rology    -0.68
roph      -0.68
riz       -0.67
rop       -0.67
Positive Logits
ity           1.33
ITY           0.94
impetus       0.90
trilogy       0.87
sin           0.82
intent        0.81
conception    0.80
iator         0.79
intention     0.79
incarnation   0.78
Activations Density: 0.034%