INDEX
Explanations
instances of mistaken actions or beliefs
instances of the word "mistaken" and related terms indicating errors or misunderstandings
Negative Logits
boats       -0.78
ILA         -0.74
acho        -0.70
tsky        -0.69
aza         -0.67
arya        -0.66
bleacher    -0.66
esta        -0.65
instein     -0.65
ktop        -0.65
Positive Logits
mistaken            0.82
perceptions         0.76
misunderstanding    0.75
LY                  0.72
unfocusedRange      0.72
assumptions         0.72
erroneous           0.71
Lowry               0.71
wrongly             0.70
impression          0.70
Activations Density 0.052%