INDEX
Explanations
the word "left" in various contexts
instances of the word "left."
Negative Logits
alez          -0.85
glomer        -0.79
andise        -0.77
displayText   -0.75
mathemat      -0.73
idated        -0.72
conduc        -0.72
riott         -0.69
issance       -0.69
Wan           -0.67
Positive Logits
overs         1.20
wing          1.11
hander        0.88
wing          0.87
ward          0.85
wich          0.84
fing          0.80
hemisphere    0.79
undone        0.76
handed        0.76
Activations Density: 0.031%