INDEX
Explanations
references to traditions and traditional practices
Negative Logits
ages          -0.16
age           -0.16
manship       -0.15
olla          -0.15
Descriptors   -0.14
bras          -0.14
agrid         -0.13
aes           -0.13
oll           -0.13
roughly       -0.13
Positive Logits
ized          0.23
ists          0.21
ised          0.20
arily         0.18
ALLY          0.17
istic         0.17
itionally     0.17
ize           0.16
/current      0.16
ively         0.16
Activations Density: 0.035%