INDEX
Explanations
apostrophes, although it has stronger reactions for more common uses of the symbol
Negative Logits
Theſe      -0.89
Datuak     -0.84
Houſe      -0.84
leaſt      -0.79
gills      -0.78
Efq        -0.78
houſe      -0.77
iſt        -0.77
purpoſe    -0.77
ſch        -0.77
Positive Logits
´          1.11
´          0.97
er         0.86
tirol      0.67
τρο        0.67
ER         0.65
erty       0.65
sinar      0.65
Nugent     0.65
Stur       0.64
Activations Density: 0.002%