INDEX
Explanations
mentions of the word "Can" followed by another word
the phrase "Can" and its variations highlighting potential or capability
Negative Logits
Token        Logit
itbart       -0.62
RAG          -0.59
aign         -0.59
geist        -0.58
grievance    -0.58
ãĥŃ          -0.57
rals         -0.57
llor         -0.57
hem          -0.56
iations      -0.55
Positive Logits
Token        Logit
Can          3.28
Can          2.42
can          1.85
Could        1.70
Must         1.69
CAN          1.64
CAN          1.63
Are          1.52
Does         1.51
Cannot       1.49
Activations Density: 0.017%