Explanations
punctuation and dialogue markers in text
Negative Logits
Germ     -0.15
Hum      -0.14
Fot      -0.14
arta     -0.13
irtual   -0.13
Mens     -0.13
ruh      -0.13
Warm     -0.13
meni     -0.13
vars     -0.12
Positive Logits
s        0.18
erto     0.15
té       0.15
rades    0.15
å©·      0.14
emax     0.14
ÏĦον     0.14
alling   0.14
sink     0.14
acon     0.14
Activations Density 0.045%