Explanations
references to old versus new concepts, particularly in the context of societal change
Negative Logits
Token        Logit
nonUne       -0.41
CppMethod    -0.36
propOrder    -0.30
säll         -0.29
tils         -0.27
ęp           -0.27
ByUrl        -0.27
nakalista    -0.27
でしょう     -0.27
Vortrag      -0.26
Positive Logits
Token      Logit
old        0.99
旧         0.97
舊         0.95
Old        0.90
旧         0.89
old        0.88
Old        0.87
régi       0.85
vecchio    0.84
vecchi     0.84
Activations Density 0.121%