Explanations
phrases related to closing, stopping, or shutting down something
phrases associated with being silenced or shut down
Negative Logits
Token           Logit
ĸļ              -0.69
IVES            -0.66
ordan           -0.63
arya            -0.62
abund           -0.62
ampton          -0.59
Amit            -0.58
ãĥīãĥ©ãĤ´ãĥ³    -0.58
covari          -0.57
ALLY            -0.57
Positive Logits
Token           Logit
tered           1.60
tering          1.43
downs           1.14
down            1.13
ters            1.07
down            1.05
tle             1.05
offs            0.92
own             0.91
downs           0.88
Activations Density: 0.026%