INDEX
Explanations
references to the passage of time, specifically years
in recent years
Negative Logits
ſta        -0.46
LEGGI      -0.40
enfans     -0.38
diable     -0.38
noDo       -0.37
unnitel    -0.37
rigue      -0.37
ſte        -0.36
dummy      -0.36
virgin     -0.36
Positive Logits
فريبيس        0.60
近年           0.59
geleden       0.57
विश्वसनीयता      0.57
Hinsicht      0.56
guère         0.55
rogels        0.55
decades       0.55
Few           0.55
רבים          0.55
Activations Density 0.005%
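One plausible reading of the density figure is the share of token positions on which the feature fires at all. The sketch below works under that assumption; the function name, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of token positions on which
    the feature is active. The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 1 of 20,000 token positions.
sample = [0.0] * 19_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.005% corresponds to roughly one active position per 20,000 tokens, which would fit a feature that responds only to a narrow family of phrases such as "in recent years".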