INDEX
Explanations
phrases related to searching or investigating
references to combative situations or actions
Negative Logits
deals       -0.82
pod         -0.75
SAY         -0.69
hold        -0.69
mos         -0.68
cius        -0.67
profits     -0.66
medium      -0.65
minecraft   -0.65
demand      -0.63
Positive Logits
comb        1.26
inators     1.02
inational   0.97
inator      0.85
uations     0.85
inatory     0.82
ihar        0.79
itud        0.79
ative       0.78
atform      0.77
Activations Density 0.006%
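
As a rough illustration of what a density figure like this can mean, the sketch below assumes it is the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: the feature fires on 6 of 100,000 tokens.
sample = [0.0] * 99_994 + [1.5] * 6
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.006%
```

Under that reading, a density of 0.006% corresponds to roughly 6 firing tokens per 100,000, which would make this a very sparse feature.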