INDEX
Explanations
words related to extreme behavior or intensity
words related to extreme behavior or mental states
Negative Logits
Token       Logit
Columbia    -0.69
hyd         -0.65
uter        -0.65
GH          -0.65
Nile        -0.63
Ô           -0.61
Nag         -0.60
Atlantis    -0.60
SYSTEM      -0.60
pta         -0.59
Positive Logits
Token       Logit
iness       1.15
craz        1.05
edly        1.00
es          1.00
iest        0.95
ese         0.92
ed          0.88
rieg        0.88
ipedia      0.87
hou         0.85
Activations Density 0.014%