Explanations
specific entities related to various fields like government, education, healthcare, and law
references to official institutions and organizations
Negative Logits
slump       -0.73
groom       -0.70
fuss        -0.68
silence     -0.62
humili      -0.62
robbers     -0.62
dared       -0.60
prey        -0.59
complains   -0.59
ometric     -0.59
Positive Logits
.''.            1.10
.).             0.97
.</             0.94
<|endoftext|>   0.94
.               0.91
UNCLASSIFIED    0.91
.;              0.86
.(              0.86
.[              0.86
*.              0.84
Activations Density: 0.508%