INDEX
Explanations
references to political actions and their implications
Negative Logits
kasarigan          -0.85
Roskov             -0.77
كومونز             -0.73
GEBURTS            -0.73
DebuggerNonUser    -0.73
PerformLayout      -0.71
đu                 -0.70
archiviato         -0.66
conftest           -0.65
للمعارف            -0.65
Positive Logits
instead            0.58
Instead            0.54
Instead            0.53
vielmehr           0.52
instead            0.48
sebaliknya         0.45
general            0.43
murni              0.43
rather             0.42
generales          0.42
Activations Density 0.585%