Explanations
phrases indicating progress, change, or development over time
change over time
Negative Logits
Token            Logit
noDo             -0.69
ſte              -0.61
ſche             -0.60
styleType        -0.59
ViewImports      -0.59
OGND             -0.59
principalColumn  -0.57
expandindo       -0.56
windowFixed      -0.56
poro             -0.54
Positive Logits
Token            Logit
sejak            0.48
since            0.48
sinds            0.47
depuis           0.46
mudou            0.44
相比             0.44
zmieni           0.43
desde            0.43
evolved          0.42
cambiado         0.41
Activations Density 0.020%
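
As a rough illustration of how a density figure like the one above could arise, the sketch below assumes that "activation density" means the fraction of tokens on which the feature fires above a threshold. That definition, the function name `activation_density`, and the sample data are all assumptions for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition of density; the page itself does not state one.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 2 of 10,000 tokens.
acts = [0.0] * 10_000
acts[17] = 3.2
acts[4242] = 0.9

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.020%
```

Under this reading, a density of 0.020% corresponds to roughly one active token in every 5,000.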