INDEX
Explanations
words related to originality, especially the term "Original", which has a very high activation value
mentions of originality in creative works
Negative Logits
Token               Logit
wal                 -0.75
walk                -0.72
ucket               -0.72
=-=-=-=-=-=-=-=-    -0.71
oned                -0.67
rolet               -0.67
ennett              -0.66
ega                 -0.66
washer              -0.65
Simulator           -0.64
Positive Logits
Token               Logit
ity                 1.26
ITY                 1.07
izations            0.90
ities               0.81
ité                 0.80
lly                 0.75
itiz                0.74
itized              0.74
ITIES               0.73
smanship            0.73
Activations Density: 0.017%
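
As a rough illustration of how figures like these could be derived, the sketch below ranks per-token logit effects and computes an activation density. It assumes that "density" means the fraction of token positions where the feature activation is above zero, and that the logit lists are simply the ten lowest and ten highest per-token values; the page itself does not state either definition. The function names `top_logits` and `activation_density` are hypothetical, the logit values are copied from the lists above, and the activation values are invented toy data.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: this is what the dashboard means by "density".
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


def top_logits(logit_effects, k=10):
    """Return (k most negative, k most positive) (token, value) pairs."""
    ranked = sorted(logit_effects.items(), key=lambda kv: kv[1])
    negative = ranked[:k]        # ascending: most negative first
    positive = ranked[::-1][:k]  # descending: most positive first
    return negative, positive


# A few token/value pairs copied from the lists above.
effects = {
    "ity": 1.26, "ITY": 1.07, "izations": 0.90,
    "wal": -0.75, "walk": -0.72, "Simulator": -0.64,
}
neg, pos = top_logits(effects, k=2)
print(neg)
print(pos)

# Invented toy activations: one active position out of 10,000.
acts = [0.0] * 9999 + [3.2]
print(f"{activation_density(acts):.3%}")
```

With real data, `effects` would hold one entry per vocabulary token and `acts` one entry per token position in the evaluation corpus.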