INDEX
Explanations
first-person statements containing caveats or conditions
variations of the word "this"
"this" in different languages
Negative Logits
Token         Logit
way           -0.47
它们          -0.40
它們          -0.40
getValueAt    -0.37
يتيمه         -0.36
Here          -0.35
respectively  -0.35
way           -0.33
itatea        -0.33
idade         -0.32
Positive Logits
Token         Logit
this          2.89
this          2.13
questo        1.73
этого         1.71
THIS          1.66
questa        1.55
هذا           1.52
этот          1.52
diesem        1.51
этом          1.51
Activations Density 7.547%
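
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number is commonly obtained: the fraction of sampled tokens on which the feature's activation is above zero. This is an assumption about the metric, not a description of how this dashboard computes it; the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The dashboard's exact definition is not stated.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical per-token activations for one feature (illustrative only).
sample = [0.0, 1.2, 0.0, 0.3]
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 7.547% would mean the feature is active on roughly one sampled token in thirteen.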