Explanations
phrases related to spreading information or completing tasks
Negative Logits
Token           Logit
unexplained     -0.69
atan            -0.66
unknown         -0.65
incomp          -0.63
ilial           -0.61
zens            -0.57
unidentified    -0.56
burd            -0.56
gling           -0.55
inher           -0.55
Positive Logits
Token           Logit
ASAP            0.92
sooner          0.85
quicker         0.78
cheaply         0.77
faster          0.68
istors          0.68
bin             0.67
vo              0.63
humming         0.62
easier          0.62
Activations Density: 0.207%
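
A density figure like the one above is commonly read as the fraction of token positions on which the feature fires. The sketch below illustrates that reading only. It assumes that "fires" means an activation strictly above a threshold of 0, and the function name `activation_density` and the toy data are hypothetical, not taken from the page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a feature counts as active when its activation is strictly
    greater than `threshold` (0.0 here); the page does not state its rule.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 1000 token positions, 2 of which activate.
toy = [0.0] * 998 + [1.3, 0.4]
density = activation_density(toy)

# Format as a percentage with three decimals, matching the page's style.
print(f"{density:.3%}")
```

On this reading, a density of 0.207% would mean the feature is active on roughly 2 of every 1000 tokens sampled.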