INDEX
Explanations
proper nouns and names, potentially focusing on political figures, events, and locations
Negative Logits
PsyNetMessage    -0.96
GEAR             -0.76
igslist          -0.73
aeda             -0.71
anwhile          -0.70
umar             -0.68
anova            -0.66
UTE              -0.65
Yugoslav         -0.65
mileage          -0.64
Positive Logits
ewater           1.29
bread            1.17
eness            1.10
eless            1.07
ened             1.03
ening            1.02
elist            0.99
estone           0.97
robe             0.95
ecast            0.94
Activations Density 0.089%