Explanations
phrases related to potential or possibility
references to potential situations or events
Negative Logits
masters    -0.96
cipline    -0.83
rike       -0.83
erie       -0.82
bane       -0.82
gall       -0.81
gar        -0.80
men        -0.80
worn       -0.80
rix        -0.80
Positive Logits
inclusion      0.81
Parenthood     0.81
defect         0.79
embodiments    0.78
inference      0.78
obstruction    0.76
outcomes       0.76
explanations   0.75
guiActiveUn    0.74
demise         0.73
Activations Density: 0.025%