INDEX
Explanations
words related to materials or substances
references to various styles and forms of artistic or aesthetic expression
Negative Logits
WARN        -0.77
ajor        -0.76
INESS       -0.74
女          -0.72
Miko        -0.72
izen        -0.70
ITNESS      -0.69
animous     -0.66
nesota      -0.66
Immunity    -0.65
Positive Logits
sty         1.30
les         0.88
led         0.86
rous        0.82
lers        0.81
rene        0.81
aky         0.80
Sty         0.80
gian        0.79
rator       0.78
Activations Density: 0.014%