Explanations
words related to leaving or exit
instances of the word "left"
Negative Logits
MAT                 -0.75
externalActionCode  -0.71
ITNESS              -0.68
oun                 -0.68
Temperature         -0.66
MpServer            -0.65
CONT                -0.64
ount                -0.63
OTAL                -0.63
inarily             -0.62
Positive Logits
wing    0.95
overs   0.94
wing    0.83
left    0.80
handed  0.72
left    0.71
Left    0.71
undone  0.71
uve     0.70
ward    0.69
Activations Density: 0.033%