INDEX
Explanations
elements related to writing and communication
Negative Logits
Physical     -0.16
insky        -0.16
physical     -0.16
Physical     -0.15
phys         -0.15
_physical    -0.15
oved         -0.15
phys         -0.14
physical     -0.14
xn           -0.14
Positive Logits
writing      0.31
Writing      0.30
Writing      0.27
writing      0.25
-writing     0.23
writers      0.23
Writer       0.22
Writers      0.21
writer       0.20
Write        0.19
Activations Density 0.189%