Explanations
reported or alleged actions
Negative Logits
sabe            1.04
knows           0.97
understands     0.93
compréhension   0.91
understand      0.89
compreender     0.89
entiende        0.88
जानते            0.86
理解             0.86
compreensão     0.85
Positive Logits
reportedly      2.33
allegedly       1.85
purportedly     1.68
announced       1.67
criticized      1.65
कथित            1.64
declared        1.58
protested       1.58
reported        1.54
repeatedly      1.48
Activations Density: 0.090%