Explanations
instances of the word "this" and related demonstrative references
this indicates a concept
Negative Logits
ArgsConstructor   -0.41
goddamn           -0.38
HasColumnName     -0.35
here              -0.35
:✨                -0.34
⇀                 -0.34
bomb              -0.34
damn              -0.33
shit              -0.33
fucking           -0.33
Positive Logits
lenker            0.63
seda              0.62
tłuma             0.58
związane          0.56
సౌకర్య             0.56
Nacionales        0.55
felicitación      0.54
Hilfs             0.54
häufigsten        0.54
üblichen          0.54
Activations Density 0.152%