INDEX
Explanations
instances of state changes or transitions occurring
Negative Logits
stället       -0.57
uyler         -0.57
feinander     -0.55
allmän        -0.54
supérieurs    -0.54
paign         -0.53
högre         -0.53
harapkan      -0.53
Awak          -0.53
DRO           -0.53
Positive Logits
depending        1.01
depending        0.87
Depending        0.70
dependiendo      0.69
respectively     0.66
Depending        0.65
kasarigan        0.64
enderror         0.62
autorytatywna    0.61
protoimpl        0.56
Activations Density 0.398%