INDEX
Explanations
instances where someone is proven wrong
occurrences of the word "wrong."
Negative Logits
Ri          -0.70
hens        -0.69
Flavoring   -0.67
incinn      -0.65
coni        -0.62
Fn          -0.61
hedral      -0.59
ats         -0.58
electric    -0.57
kamp        -0.57
Positive Logits
fully            0.96
eous             0.95
guiActiveUn      0.81
headed           0.81
ed               0.74
sight            0.73
unfocusedRange   0.72
spin             0.70
doing            0.67
dest             0.66
Activations Density: 0.018%