INDEX
Explanations
concepts related to virtue and moral characteristics
Negative Logits
Token           Logit
Roskov          -1.00
дописавши       -0.90
évaluateur      -0.82
endpush         -0.82
Aidan           -0.78
TAINMENT        -0.77
HasAnnotation   -0.77
Ander           -0.75
bouts           -0.75
plagio          -0.75
Positive Logits
Token           Logit
Vir             1.53
vir             1.45
vir             1.38
VIR             1.34
Vir             1.32
virtue          1.23
virulence       1.12
VIR             1.11
Virtue          1.09
Virt            1.05
Activations Density 0.006%
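
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed, assuming it means the fraction of token positions on which the feature fires (activation above zero). That reading is an assumption, not something the page states. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce a value of the same magnitude.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active; the source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 6 active positions among 100,000 tokens.
sample = [0.0] * 99_994 + [0.7, 1.2, 0.3, 2.5, 0.9, 1.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.006% would correspond to roughly 6 active positions per 100,000 tokens, which marks the feature as very sparse.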