NEGATIVE LOGITS
.variant          -0.07
geen              -0.06
tir               -0.06
coincide          -0.06
tug               -0.06
primo             -0.06
(no token text)   -0.06
","","            -0.06
avy               -0.06
otel              -0.05

POSITIVE LOGITS
shaw               0.06
ohan               0.06
Tropical           0.06
cuck               0.06
Logger             0.06
preter             0.06
ADATA              0.06
_results           0.06
brate              0.06
cept               0.06

Activations Density: 0.028%