Explanations
references to traditional marriage and religious customs
Negative Logits
εβ         -0.16
erland     -0.16
Hydra      -0.16
vill       -0.16
elah       -0.15
LLL        -0.15
JKLMNOP    -0.15
920        -0.14
/unit      -0.14
.Unit      -0.14
Positive Logits
temple     0.50
Temple     0.46
temples    0.42
Temp       0.41
temp       0.39
Tem        0.38
temp       0.36
TEM        0.36
-temp      0.35
Temp       0.33
Activations Density 0.176%