Explanations
references to fiction or fictional works
Negative Logits
Muh             -0.67
GEBURTS         -0.66
Koenig          -0.64
Kase            -0.64
kysy            -0.63
Precautionary   -0.63
posia           -0.63
MessageOf       -0.62
atra            -0.61
Eich            -0.61
Positive Logits
fiction         1.34
Fiction         1.32
FICTION         1.20
Fiction         1.15
Fic             1.03
fiction         1.00
Fic             1.00
fictional       0.97
fic             0.96
ficción         0.92
Activations Density: 0.005%