INDEX
Explanations
detailed descriptions of events and situations
Negative Logits
olves               -0.78
SourceFile          -0.76
Coverage            -0.66
zing                -0.64
externalActionCode  -0.63
nexus               -0.62
aukee               -0.61
clave               -0.61
ORPG                -0.60
ÅĤ                  -0.60
Positive Logits
alas                1.21
moreover            1.00
secondly            0.98
interestingly       0.97
unsurprisingly      0.96
cru                 0.96
uh                  0.93
furthermore         0.92
yes                 0.91
frankly             0.91
Activations Density 0.479%
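
The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below assumes that reading (activation strictly above zero counts as firing); the function name `activation_density`, the threshold, and the sample data are hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means (tokens where the feature fires) / (all tokens).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 1000 tokens, 5 of which activate the feature.
sample = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]
print(f"{activation_density(sample):.3%}")  # prints "0.500%"
```

Under this reading, a density of 0.479% would mean the feature is active on roughly 1 in 209 tokens.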