INDEX
Explanations
words related to directions and positions, specifically the term "Left" with different numerical values indicating varying degrees of relevance or importance
instances of the word "left."
Negative Logits
glomer    -0.77
andise    -0.77
riott     -0.70
issance   -0.70
idated    -0.68
覚醒      -0.68
conduc    -0.67
alez      -0.65
NRS       -0.65
antz      -0.65
Positive Logits
overs         1.27
wing          1.10
hander        0.94
wich          0.89
ward          0.89
fing          0.86
wing          0.80
handed        0.80
handed        0.80
hemisphere    0.78
Activations Density 0.041%