INDEX
Explanations
phrases indicating comprehension or understanding
understanding or lack thereof
Negative Logits
깐              -0.51
consommate      -0.44
kecelakaan      -0.41
Linki           -0.41
egz             -0.40
Otras           -0.39
aveug           -0.39
ब्रेकडाउन         -0.39
agência         -0.39
mutiara         -0.38
Positive Logits
understood      1.75
understood      1.54
Understood      1.23
misunderstood   0.85
stood           0.84
comprehended    0.78
verstanden      0.76
defined         0.71
Defined         0.71
Knew            0.66
Activations Density: 0.004%