Explanations
numerical values or measurements
key actions or events that signify significant outcomes or developments
Negative Logits
ãĥ¼ãĥĨ      -0.63
incent      -0.62
princ       -0.62
deem        -0.62
councill    -0.62
ò           -0.60
teasp       -0.60
citiz       -0.59
pione       -0.58
env         -0.58
Positive Logits
.           1.11
.;          1.05
.:          1.03
.,          1.02
.?          0.98
.-          0.97
.–          0.92
.(          0.87
.--         0.83
./          0.81
Activations Density: 0.372%