Explanations
No Explanations Found
Negative Logits
Token    Logit
bold     -0.88
ï¸       -0.77
oak      -0.76
wagen    -0.72
oufl     -0.67
Gloss    -0.67
fixme    -0.67
Roses    -0.66
^^^^     -0.66
Wee      -0.66
Positive Logits
Token         Logit
worshipped    0.73
developed     0.72
starved       0.70
hered         0.68
awakened      0.67
apsed         0.67
communal      0.66
fused         0.66
hunger        0.64
Saharan       0.64
Activations Density: 0.000%
This feature has no known activations.