INDEX
Explanations
phrases related to triggers or activations
words related to traditional or formal categories
Negative Logits
Token          Logit
condem         -0.59
livelihood     -0.58
medium         -0.58
innocence      -0.57
fetch          -0.56
appraisal      -0.55
welcome        -0.55
Reincarn       -0.54
Prohibition    -0.54
ransom         -0.53
Positive Logits
Token          Logit
itionally      1.04
lycer          0.96
portation      0.86
bish           0.85
MpServer       0.77
ulously        0.74
imensional     0.73
ulence         0.73
ecast          0.73
ajo            0.73
Activations Density: 0.069%