INDEX
Explanations
remains unchanged or intact
Negative Logits
segera          0.48
préal           0.47
okam            0.47
сразу           0.47
imediatamente   0.46
sofort          0.46
immédiatement   0.46
odmah           0.46
immédi          0.45
dès             0.44
Positive Logits
indefinitely    0.89
intact          0.85
throughout      0.75
longer          0.71
unchanged       0.70
despite         0.64
dłu             0.64
Longer          0.62
बरकरार          0.61
不变            0.59
Activations Density 0.027%