Explanations
references to the word "original"
references to the concept of "originality" or "original" content
Negative Logits
Token               Logit
wal                 -0.75
walk                -0.74
rogens              -0.70
robe                -0.68
=-=-=-=-=-=-=-=-    -0.68
opping              -0.67
abad                -0.67
asia                -0.67
orney               -0.66
owler               -0.65
Positive Logits
Token               Logit
ITY                 0.96
incarnation         0.96
ity                 0.84
trilogy             0.83
version             0.79
batch               0.79
impetus             0.78
ドラゴン            0.77
isations            0.75
istar               0.75
Activations Density 0.014%