INDEX
Explanations
references to writers
references to writers and writing
Negative Logits
Token     Logit
illon     -0.83
rals      -0.77
CCTV      -0.77
Ĥª        -0.75
Ĭ±        -0.68
Lumpur    -0.66
Sensor    -0.66
EGA       -0.66
avior     -0.64
hens      -0.63
Positive Logits
Token     Logit
writing   0.99
laureate  0.98
writer    0.91
writers   0.87
uscript   0.86
writ      0.86
hip       0.84
Beware    0.81
fiction   0.78
smanship  0.76
Activation Density: 0.076%