Explanations
various sentences or statements, likely focusing on news or commentary
Negative Logits
Token       Logit
induct      -0.84
grounding   -0.82
sunset      -0.80
dispers     -0.80
galvan      -0.79
electr      -0.79
portraits   -0.78
stocking    -0.76
sidel       -0.75
portrait    -0.75
Positive Logits
Token       Logit
com         1.51
org         1.34
fm          1.32
edu         1.28
net         1.25
Org         1.22
blogspot    1.18
exe         1.16
info        1.16
tv          1.14
Activations Density: 0.246%