NEGATIVE LOGITS
sadistic          0.52
incest            0.51
hypothes          0.50
traitor           0.50
indoct            0.48
hypothesized      0.47
hallucin          0.46
ക്കണം              0.45
ሳት                0.45
hallucinations    0.45
POSITIVE LOGITS
stylish       0.98
élég          0.97
minimalist    0.95
elegante      0.95
design        0.94
디자인        0.93
elegance      0.92
elegant       0.90
sleek         0.88
diseño        0.87
Activations Density 0.080%