Explanations
language relating to software or technology
Negative Logits
manif      -0.73
:]         -0.72
elim       -0.66
quartered  -0.65
recomp     -0.63
fres       -0.61
chwitz     -0.61
remod      -0.61
sic        -0.59
rigging    -0.59
Positive Logits
1.18
1.14
Tumblr
1.10
Tumblr
1.07
1.03
1.02
1.02
1.01
0.99
0.97
Activations Density 1.617%