Explanations
terms related to non-fiction and fiction genres
Negative Logits
Token        Logit
zn           -0.16
timeofday    -0.16
sost         -0.15
stvo         -0.15
$MESS        -0.15
acha         -0.15
uz           -0.15
aeda         -0.14
ê±°ëŀĺê°Ģ    -0.14
igans        -0.14
Positive Logits
Token        Logit
nat          0.16
subjects     0.15
indow        0.15
aler         0.14
antz         0.14
inet         0.14
ÏįÏĢ         0.14
çıł          0.14
енко         0.14
homepage     0.14
Activations Density: 0.002%
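
Some token strings in the logit lists (ê±°ëŀĺê°Ģ, ÏįÏĢ, çıł) look like the raw character form that byte-level BPE tokenizers use for non-ASCII bytes. The page does not name the tokenizer, so the following is only a sketch under the assumption that the tokens use the GPT-2 style byte-to-character table. The helper names `bytes_to_unicode`, `CHAR_TO_BYTE`, and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode() -> dict:
    """Byte -> printable character table (GPT-2 style byte-level BPE; assumed here)."""
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(0x21, 0x7F))     # '!' .. '~'
        + list(range(0xA1, 0xAD))   # inverted '!' .. not sign
        + list(range(0xAE, 0x100))  # registered sign .. y with diaeresis
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to the first unused code points from U+0100 up.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed token string back to text, assuming byte-level BPE."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. Cyrillic) are kept as they are.
            raw.extend(ch.encode("utf-8"))
    # A single token can hold a partial UTF-8 sequence, so do not fail on it.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ê±°ëŀĺê°Ģ", "ÏįÏĢ", "çıł", "енко", "nat"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption the three strings decode to valid UTF-8 (Korean, Greek, and Chinese characters respectively), which suggests they are tokenizer display artifacts rather than corrupted text. Plain ASCII tokens such as nat and already-readable tokens such as енко pass through unchanged.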