INDEX
Explanations
references to cultural or geographical terms and concepts
references to various ethnic groups, particularly those with "arian" in their names
Negative Logits
Token       Logit
touching    -0.63
Multi       -0.62
Wo          -0.61
perf        -0.61
foul        -0.60
lasting     -0.59
handles     -0.59
wond        -0.59
rout        -0.58
disabled    -0.58
Positive Logits
Token       Logit
arian       4.96
arians      3.84
aria        1.94
arius       1.74
ary         1.55
arial       1.52
arist       1.51
ari         1.49
aries       1.49
ariat       1.45
Activations Density: 0.009%