INDEX
Explanations
mentions of returning or resuming states or activities
Negative Logits
Token            Logit
Jeografia        -0.94
myſelf           -0.89
itſelf           -0.84
autorytatywna    -0.81
preſent          -0.81
ujednoznacz      -0.77
Reſ              -0.77
ſche             -0.77
pleaſure         -0.76
humaines         -0.75
Positive Logits
Token            Logit
เต็ม              0.48
fili             0.44
(no token text)  0.43
igan             0.42
Vegeu            0.42
A                0.42
es               0.41
dis              0.41
umeur            0.41
Moyen            0.41
Activations Density 0.076%