INDEX
Explanations
instances of dialogue or quotes in the text
Negative Logits
warts        -0.16
gers         -0.16
cplusplus    -0.15
orca         -0.15
hem          -0.15
itura        -0.14
ea           -0.14
hap          -0.14
ikip         -0.14
enko         -0.14
Positive Logits
Moder         0.16
åĬ¿           0.14
_publisher    0.14
elay          0.14
Moder         0.13
abr           0.13
revolves      0.13
gfx           0.13
Gain          0.13
Luxury        0.13
Activations Density 0.006%