Explanations
content related to crime and punishment in socio-political contexts
Negative Logits
tagHelperRunner    -0.60
gynhyrchwyd        -0.59
GEBURTS            -0.59
avoient            -0.58
myſelf             -0.57
feroit             -0.56
Perſ               -0.53
houſe              -0.52
balleur            -0.52
verwijspagina      -0.52
Positive Logits
compares     0.44
worst        0.40
worse        0.39
kı           0.39
序           0.38
compared     0.37
signaling    0.36
比           0.35
ob           0.34
compar       0.34
Activations Density 0.147%