Explanations
contractions and possessive forms in the text
Negative Logits
Token            Logit
ModelExpression  -0.56
вік              -0.55
Feu              -0.55
pia              -0.55
trans            -0.53
Hein             -0.53
record           -0.52
Jacobsen         -0.52
$=$              -0.52
)                -0.52
Positive Logits
Token      Logit
houſe      0.92
Houſe      0.84
purpoſe    0.83
whoſe      0.80
Pilate     0.79
myſelf     0.79
Athenians  0.78
abſ        0.78
uſe        0.78
deſt       0.77
Activations Density 0.128%