INDEX
Explanations
references to actions, events, and discussions in a narrative context
Negative Logits
Vers        -0.70
SPD         -0.70
juven       -0.66
dismant     -0.66
pursu       -0.63
millenn     -0.63
Techn       -0.61
predicate   -0.61
stag        -0.61
traged      -0.61
Positive Logits
senal                0.99
rael                 0.98
tics                 0.96
Ĥİ                   0.95
guiActiveUnfocused   0.86
ws                   0.84
abella               0.83
omorphic             0.83
wolves               0.83
sis                  0.82
Activations Density: 0.366%