Explanations
references to originality or original content
references to original content or originality in various contexts
Negative Logits
Token       Logit
ucket       -0.72
Simulator   -0.70
wal         -0.69
ega         -0.67
roph        -0.66
rom         -0.66
angs        -0.65
rolet       -0.65
walk        -0.64
ower        -0.63
Positive Logits
Token       Logit
ity         1.41
ITY         1.10
izations    0.95
ities       0.86
ité         0.85
lly         0.85
itiz        0.82
itized      0.81
iator       0.80
smanship    0.79
Activations Density: 0.020%