Explanations
words related to mental states such as agitation, confusion, and paranoia
words or phrases associated with various forms of "ness," indicating quality or state
Negative Logits
Token        Logit
ODE          -0.71
Auschwitz    -0.65
bern         -0.64
verbs        -0.64
ORN          -0.62
ask          -0.62
WAR          -0.62
ellar        -0.61
amen         -0.60
veh          -0.60
Positive Logits
Token        Logit
iness        1.16
terness      0.97
ness         0.93
yy           0.87
nesses       0.85
ionage       0.82
ĪĴ           0.81
hip          0.79
yk           0.79
edly         0.73
Activation Density: 0.027%