Explanations
speaker's past experiences and observations
Negative Logits
resulted        1.06
resembled       0.97
其              0.95
attracted       0.93
its             0.90
necessitated    0.90
underwent       0.89
transpired      0.88
occurred        0.88
originated      0.88
Positive Logits
heard           1.13
known           1.11
Known           1.09
conocido        1.07
Known           1.06
never           1.03
conhecido       1.03
извест          0.98
conocida        0.97
hears           0.96
Activations Density 0.228%