INDEX
Explanations
negative contractions related to existence or presence
Negative Logits
<?                  -0.66
Spoon               -0.66
bib                 -0.64
(no token shown)    -0.64
parap               -0.63
ſei                 -0.62
Panorama            -0.62
alleye              -0.61
tif                 -0.61
geſch               -0.61
Positive Logits
vôtre               0.56
quien               0.52
dieux               0.51
nôtre               0.51
deudas              0.49
isn                 0.48
costes              0.47
antemano            0.44
jouet               0.43
précédents          0.43
Activations Density 0.286%
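
The density figure above is commonly read as the share of token positions on which the feature is active. The sketch below illustrates that reading only. It assumes "active" means an activation strictly greater than zero, and the function name, the threshold parameter, and the sample data are all hypothetical rather than taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as active when its value is strictly
    greater than `threshold` (0.0 by default). This is an illustrative
    definition, not one confirmed by the source page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 3 of which are active.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)

# Format as a percentage with three decimals, matching the style of the page.
print(f"{density:.3%}")
```

Under this reading, a density of 0.286% would mean the feature fires on roughly 3 out of every 1,000 tokens sampled.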