Explanations
instances of the word "know."
Negative Logits
Token      Logit
Larsson    -0.59
Gill       -0.59
huriyet    -0.58
feu        -0.58
Schiller   -0.57
قای        -0.56
Tig        -0.55
Efforts    -0.55
گذاری      -0.55
TEntity    -0.55
Positive Logits
Token      Logit
Know       1.89
knows      1.88
Know       1.88
know       1.86
know       1.85
KNOW       1.81
Knows      1.79
KNOW       1.78
knows      1.76
knew       1.72
Activations Density: 0.072%