Explanations
references to returning to a previous state or normalcy
Negative Logits
Token             Logit
unterschied       -0.45
tambahan          -0.43
hindurch          -0.42
Hintergrund       -0.41
arrivare          -0.39
resulting         -0.39
reactstrap        -0.39
llegar            -0.39
supplémentaire    -0.38
zatím             -0.38
Positive Logits
Token             Logit
normalcy          1.00
normality         0.99
normalidad        0.82
basics            0.78
semula            0.75
normal            0.74
lost              0.72
original          0.70
familiar          0.70
原来的            0.70
Activations Density 0.328%